The value of information

October 16, 2007

About a year ago, I wrote a post on access and monetization of biological content. In that post, I talked about the value of scientific content and the possibilities of monetizing that content.

Which brings us back to monetization. Some would argue that biological information belongs in the public domain and should be freely accessible. I agree, but it depends on how one defines information. Gene sequences, protein structures, etc., do belong in the public domain, but using that information to make decisions, build products, and draw all kinds of conclusions is where the fun lies.

Today Tim O’Reilly said something that troubled me greatly. A day ago, he wrote an absolutely brilliant post questioning the value of the herd mentality seen on sites like Techmeme. Today, he adds to those thoughts, arguing that connected systems are smarter, but only to a point. There are some very interesting points in his post:

And that brings me to another important point: precisely because once everyone “knows” the same stuff, there is no competitive advantage to be had from learning it, eventually that over-known area loses its importance. Even in cases where there is no bubble, one real source of competitive advantage is knowing something others don’t. Web 2.0 has been about unlocking value by sharing what we know, but that doesn’t mean that keeping information private has lost its value as a technique. It’s just temporarily out of favor.

and (emphasis mine)

What does this tell us about the future of Web 2.0? It tells us that the pendulum is going to swing from public data to private. I’m not sure when this will happen, but I’m quite confident in the trend. (If you see signs of areas where this is already happening, be sure to let us know.) There’s still a LONG way to go in making all the world’s information accessible, but maybe one sign of the maturity of the Web 2.0 trend will be when more companies are started on the premise of keeping information hidden than on building publicly accessible repositories that grow through user contribution.

Does this mean that we are moving to a world where data will get locked into the kind of silos we have long suffered from in the life sciences (e.g. PMR’s post)? I hope not, and I don’t believe that Tim was referring to all types of data either. I have blogged several times about the limited inherent value of data. “Data” in this context refers primarily to raw data, e.g. the genomic sequence of an organism (I don’t mean to diminish the challenges in collecting these data). One could generalize that to mean any commoditized data, e.g. a list of all known protein domains, or the various expression signatures related to disease, which are likely to become a commodity some day. Information from the data is another story. Within that information lies your differentiator, and therefore, in a knowledge economy, the value of an organization. If I may go with a glass-half-full outlook, I believe a shift will take place in which raw, basic data will all go out into the public domain, and the emphasis will be on building value by capturing the meta-data and information locked up in those data.

What does this imply about collective intelligence? Doesn’t the concept of collective intelligence contradict Tim and me? I am not sure it does. Methods that leverage publicly available data sets are in essence capturing the collective intelligence in those data sets by mining them for information. In an expert setting, like the life sciences, the intelligence in the myriad publications is available to us. Those who learn to glean maximal information from those resources then have a choice: put it out in the public domain, or keep it private. In either case there are business models to be explored.

I will take the example of companies that mine literature for pathway information. The literature information, in theory, is available to all. Each company in the space has its own approach to mining and deploying the knowledge about pathways that it derives from literature and other sources. In a way, they are harnessing collective intelligence, and keeping useful data, their differentiator, private. The ones that do a better job are the more successful ones. On the flip side, companies providing information on toxicogenomics have been unable to provide value and have struggled. Once again, the market determines what is valuable and in what form. Does this mean toxicogenomics has no value? Not at all. It just means that the way it was being provided and the business model being used to provide it were not appropriate. That is a story for another day.

Even pharma companies have realized this. Novartis decided to make the results of a whole genome association study publicly available. Did they feel the data had no value? I doubt it. For one, the study was too large to have been carried out by a single entity. They collaborated with academic institutions, but results from other collaborations have not been made public before. One possibility is that there is so much information to be gleaned from the very large volume of data that one organization cannot possibly do so. In addition, a pharma company like Novartis is after only part of the underlying information. I think it is a sign of the times that they are not locking everything into a silo, but are willing to share the rest of the information with the community. Science benefits, and in the long run Novartis might benefit too.

So where does that leave us? I believe that we won’t go back to a world of closely guarded proprietary information. What is more likely is a situation like the one described above, a mix of data in the public domain and meta-data and other information that may or may not be kept private depending on the business models chosen.



Comments

Viewing 3 Comments

    I agree with you. Once the information genie is out of the bottle, you can't stuff it back in. The reason is simple: no matter how many people think information should not be free, it only takes one person in the world to publish it. And with plenty of options for free hosting and indexing, there is no cost barrier to that.

    Just take what has happened to open course content. Many teachers were (and still are) paranoid about keeping their online class content closed. But it only takes one teacher in the world in a particular discipline to blow that up. A year after I started posting my recorded lectures in organic chemistry online, several other professors did the same thing, further enriching the global learning base. The same goes for textbooks: there are now several high quality university level organic chemistry textbooks freely available online. But most publishers still think that copyright control is the only card they can play.

    There are always differentiators. Take just the fact that you are the first to develop a new method in the lab: even if you fully describe the method when you publish it, you're still at the leading edge, and that gives you an advantage in working with that method or in making improvements.
    I don't really agree with the idea that once everyone knows something there is no longer a competitive advantage in knowing it. No person can know everything, and what a person or group does with that shared public knowledge is strongly dependent on the group's background. I think that in a very open world of knowledge it becomes increasingly difficult to be original as an individual because it is likely that someone out there has the same knowledge and mindset and ideas. However, as a sum we should be collectively more creative. There are many more different melting pots and discussions over this shared knowledge.
    Even the simple act of having open discussions about this shared knowledge can be of value. We can learn how to argue our positions, to spot weaknesses in logic, etc., from observation.

    Pedro,

    Yes, there are advantages to being first to market, but first mover advantage only goes so far, especially in the absence of protection (e.g. Google was not the first search engine).

    As far as knowing everything, that is contextual. For example, back in the late '90s there was value in providing pharma companies with target information, but after a while that became a commodity and therefore not a sustainable business. That didn't mean target information was not valuable anymore, but the value to pharma companies was not such that you could build a business on it.

    As far as the collective being more creative, I definitely hope that is true, especially if the contribution from parts is diverse. It's an argument that is being made today as the industrial R&D model changes. It is not possible for one organization to contain the requisite knowledge to be successful.
 
