We need to change the system
August 10, 2008
I return to one of my favorite topics, open data and data ownership. Discussions with some very smart people over time (including a recent one on Friendfeed) have convinced me that our problem does not necessarily lie in scientists being inherently protective of their data, but rather in a system that encourages them to be protective.
First let me throw out some oft-repeated mantras that drive my philosophy, mostly stolen from other wiser people.
Raw data by itself is not the value center. Value comes from the interpretation of these data.
Data finds the data (then people find the people) (via Jeff Jonas and Jon Udell)
Wherever you are, there is someone smarter somewhere else (Via Tim Bray, channeling Bill Joy)
Now that we have those thoughts out of the way, let's assume that most people involved in science do care about science in general, and acknowledge that as humans we need recognition in some manner. The challenge then lies not in trying to fit our desires into an existing, broken system, but rather in taking this system, which is very long in the tooth, and changing it.
The science blogosphere, The BioGang, etc. are but a small part of the scientific community. Some of us have the ability to make change from within, some of us have a bigger pulpit than others, and some of us can only write about the changes we would like to see. So it’s going to take a while, but if pharma companies can agree to share pre-competitive biomarker data, then academics can change as well.
I still maintain that raw data should be made public within a reasonable time. You might want to re-check the data quality, or perhaps your data were collected to support a hypothesis, and you have every right to test it out. But you can’t sit on those data. Complete your analysis and make them available. And if the data are collected for the sake of data collection (genome study, high-throughput structure determination), then you must make them available ASAP. There is enough in there to keep many, many people busy.
The other aspect is data ownership. Large data sets of fundamental data belong in the public domain. Supporting data, that is, data that support a paper or some hypothesis or discovery, I am not 100% sure about. I think there needs to be some form of attribution, especially if you don’t plan to publish the data in a paper. How do we manage that? I don’t know. Others have studied this for a longer time. How does this protect long-term monetization prospects? Actually I think that’s the easy part, and I’ve covered it many times before.
Sometimes I feel that it’s pointless to write about this subject, one I care about more than most. Then I remember how much I care.



