A new model for CASP
December 29, 2006
The CASP “competition” (don’t let the organizers tell you otherwise) has over the years been one of the biggest players in the protein structure prediction world. It single-handedly launched David Baker into a new stratosphere some years ago, when his ab initio folding program proved to be significantly better than other methods. (The quantum chemist in me always cringes when ab initio is used in structure prediction, so it was good to note that the CASP organizers are now calling it template-free prediction.) Entire labs shut down for months, focusing almost exclusively on CASP. The exercise was a great idea, especially in the late 1990s and early 2000s, when homology modeling, threading and ab initio prediction were still evolving. Over the years, though, I think it has lost a lot of its bite, for two reasons. For one, the hot area now is protein-protein docking. The second has to do with the Protein Structure Initiative. High-throughput structure prediction, or structural bioinformatics, was at its zenith between 1998 and 2002, at a time when high-throughput crystallography was at its academic and commercial peak. The field is still relevant today, but a lot of the buzz is gone. All the easy structures are done, the tough ones are left, and the focus has shifted to more efficient, robust and targeted structural biology. One might have thought that structure prediction would focus on quality at this juncture, and to some extent it has, but as recent CASPs have shown, there has been very limited success. This is unfortunate, since improved structure prediction could prove invaluable to the whole drug discovery process.
Getting back to the original topic of this post: today we have a far different environment than when CASP was originally started. The CASP categories have evolved with time, but I am not sure the process has changed that much. Instead of the two-year competition cycle, it would be interesting if the program became more dynamic, i.e. there would be a list of structures eligible for prediction for a certain duration, after which they are moved into the archives. Any attempts at predicting a structure would be submitted to a central server, where automated metrics can be used to analyze them, and all the raw data would be made available to the public, so that different people can slice and dice it the way they like. This way, instead of making CASP and the bragging rights that go with it the focus of research, people can actually focus on developing methods and use CASP structures as an excellent testing ground for those methods. Then, every two years, at the CASP meeting, groups can get together to analyze the progress, if any, made in the previous two years, with perhaps some groups presenting analysis data, again looking at the dynamic results. Also, each CASP group should maintain a blog or wiki where the details of the methods used, minus any trade secrets, are listed and where people can add their own comments. Using technologies like RSS, blogs and wikis, it would be interesting to see how the community functions.
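To make the "automated metrics" idea concrete, here is a minimal sketch of what a central server might compute for each submission. Everything in it is an assumption for illustration: the function names, the cutoffs, and the premise that the predicted and experimental coordinates arrive already superimposed with one point per residue. The second function is only a simplified score in the spirit of GDT_TS, not the procedure CASP assessors actually use, which searches over many superpositions.

```python
import math

# Hypothetical distance cutoffs in angstroms for the GDT-style score below.
CUTOFFS = (1.0, 2.0, 4.0, 8.0)


def _check(predicted, reference):
    """Reject empty or mismatched coordinate lists."""
    if not predicted or len(predicted) != len(reference):
        raise ValueError("coordinate lists must be non-empty and of equal length")


def rmsd(predicted, reference):
    """Root-mean-square deviation between paired 3D points.

    Assumes the two structures are already superimposed.
    """
    _check(predicted, reference)
    total = sum(math.dist(p, r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(total / len(predicted))


def gdt_like(predicted, reference):
    """Mean percentage of points lying within each cutoff of the reference.

    Simplification: uses the single superposition it is given.
    """
    _check(predicted, reference)
    distances = [math.dist(p, r) for p, r in zip(predicted, reference)]
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in CUTOFFS
    ]
    return 100.0 * sum(fractions) / len(CUTOFFS)


if __name__ == "__main__":
    # Toy coordinates, made up purely to exercise the functions.
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
    predicted = [(0.0, 0.0, 0.0), (1.0, 0.5, 0.0), (2.0, 3.0, 0.0), (3.0, 10.0, 0.0)]
    print(f"RMSD: {rmsd(predicted, reference):.2f}")
    print(f"GDT-like score: {gdt_like(predicted, reference):.1f}")
```

Publishing the per-residue distances alongside summary numbers like these is what would let other people slice and dice the raw data with metrics of their own.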
If anyone participating in CASP is reading this, what problems could you see with a more dynamic approach?
Technorati Tags: CASP, Protein Structure Prediction, Structural Biology, Bioinformatics
Digg-ing science
December 27, 2006
Last night, I was listening to an episode of TalkCrunch where Mike Arrington, Robert Scoble and Om Malik were having a discussion on technology trends and startups. One of the subjects that they covered towards the end of the podcast was Digg and the impact it might have on traditional media (it was an older episode). They talked about how they used Digg, where it helped them, and so on.
The discussion got me thinking about Digg and science, a subject I have not seen discussed elsewhere. Digg has always had a science section, but I have only used it sparingly. When I look at the science page, two things jump out at me:
- The number of Diggs does not quite match the other sections (which is not surprising)
- The quality of stories is not quite the best
Why is that? Before I go any further, a qualifier. I love Digg conceptually, but am not a heavy user of the site, since I like getting my information through blogs, del.icio.us, techmeme, postgenomic and good old Google. The problem with Digg is that as it has become more popular, the level of childishness in the comments has gone up, and that rubs me the wrong way. I do like using the swarm and the stack, both nifty little features. The new design also makes it very easy to track the upcoming section, perhaps a more valuable resource than the front page itself. Getting back to science, the quality of the top Dugg stories varies wildly. I would say that the best stories that make their way to the front page tend to be the type of stories that a magazine like Scientific American might carry. In between those are what one could call tabloid science. The Upcoming section actually has some excellent stories. Based on a very unscientific sampling, I would say that the quality of the general stories promoted to the front page has improved, definitely from the general science perspective, but I am not sure that my blogging brethren are quite that well represented. I wonder if anyone has noticed a “digg” effect for their science blogs? I am curious to find out how the scientific blogosphere and Digg might benefit from each other. The best general resource available to scientists, especially those who are biologically inclined, is Postgenomic, but that is closer to techmeme in concept (a service I prefer to Digg) and not the kind of real-time resource that Digg can be. It is an excellent source of ideas and a good way to identify hot-button topics in the scientific blogosphere. But it is a resource for bloggers and the niche of people who follow those blogs. What I feel is missing is a resource that makes the opinions of bloggers available to the masses outside the scientific blogosphere. Bloggers like Sandra Porter or Derek Lowe should be read by a mass audience, and could really benefit from the kind of impact Digg has.
The questions, then, are these: does a service like Digg (it doesn’t have to be Digg itself) have a place in expanding the reach of scientific bloggers? Is that something that is even desirable? I would love to find out what people think.
Technorati Tags: Digg, Science, Blogosphere, Postgenomic
An open scientific future
December 25, 2006
I have been meaning to write a commentary on a couple of posts on the open future of science at 3 Quarks Daily for some time now. What better time than the end of 2006, a year in which open science has become quite the recurring theme in these parts. Bill Hooker, in Part 1 of an essay on open science, does a wonderful job of putting Open Access into context, especially for someone perhaps unfamiliar with the Open Access movement. In Part 2, he talks about Open Science, a much fuzzier term, as you can probably make out. Once again, he does an excellent job of going through the various interpretations of the phrase and what it means to different people. For those who want to find out about Open Access and Open Science, I strongly encourage reading the two pieces. They are a rich source of links and definitions, especially for those who might be curious to dig deeper into the subject. What I would like to do is provide a more personal take on the subject.
There are many drivers for opening up science. One of the biggest is that much of science, especially academic science, is funded using taxpayer money (via organizations like the NIH). In addition, there are some who believe that academic science inherently belongs in the public domain. I would add another twist to the tale, one that has to do with publishing and data transparency. In the current “publish or perish” system, the pressure to publish is constantly increasing, leading to what almost seems like a conveyor belt of publications in certain hot fields. There are several problems with this system. A lot of papers that do not belong in peer-reviewed journals make it there. In addition, many papers only include the interpretation of the data and some methods. As experiments become more complex and technology is pushed to its limit, it has become increasingly difficult for scientists to reproduce work done by others. One of the biggest drivers for open science, regardless of how it is defined, is the transparency that it brings to the scientific process. Fundamentally, most scientists are interested in furthering knowledge, and I think the general attitude of scientists who are always afraid of being scooped or reluctant to share data is misplaced and against their own general convictions. It is for that reason that I think that over time academic research will increasingly start moving in the direction taken by the Creative Commons license. PLoS One has already taken a step in this direction, and as scientists start using wikis and blogs to make their research less of an insular operation, I feel that there should be a framework under which it is practiced. Science Commons is one possible way of doing it, rather than every person just using their own approach to sharing their research.
What I talk about above, though, is only a framework by which scientific information, methods, data and results can be openly shared. Open science in the end is about how science is practiced. In particular, the concept from Worldchanging, where they make an analogy to open source software, resonates well in these parts. Assuming the framework exists, I do think there is a possibility to encourage networked science involving scientists from around the world. There are some challenges, e.g. funding, government laws about cross-border research and so on, but perhaps openly licensed, standards-driven science can become the kind of stimulus to science that open source software and recent technological developments have been to the internet and user-generated content. For starters, an oft-repeated plea. The resources available for scientists and researchers to get their work in the hands of their peers are better, easier to use and more accessible than ever before.
Further Reading:
The Future of Science is Open, Part 1: Open Access
The Future of Science is Open, Part 2: Open Science
The Redfield lab
Technorati Tags: OpenScience, OpenData, ScienceCommons, CreativeCommons
Things I noticed #17
December 22, 2006
I have not been too good with my roundup lately. I am actually considering a change in format early in the new year (but I am going to keep that close to my chest, since I have been known to come up with grandiose ideas in the past). So this week we have!!!
The Canadians must be on to something
From the Toronto area comes this piece of news about a TV show called ReGenesis, which seems to have some real science in it. I think any show that talks about “molecular medicine” is worth a look-see.
A non-profit innovation network
Innocentive is launching a non-profit innovation network, according to the Business Innovation Insider. Perhaps this is more in line with the open innovation that was discussed earlier.
That’s what happens with dragons
I am sure you’ve heard the whole story of the virgin dragon by now. The best place to read about it? Sandra Porter talks about the subject while giving us a lesson on genotyping along the way. Wonderful!!!
Docking & Scoring
After spending some time in recent years devouring this subject, I thought I wasn’t going to think about it for a while, but Milo’s post got my attention.
Women in science
Tara C. Smith has had a series of articles on this subject recently, stemming from a strong reaction to this rather curious post (I still don’t get the purpose behind it).
Way to go, Jeremy
Jeremy Allison, better known for his Samba roots, has made a statement. Techmeme pointed me to the news of his resignation from Novell in protest of the Microsoft-Novell patent agreement. The commentary on Groklaw is in line with my thoughts on this matter (the comments make for some interesting reading too).
Update: As Krish mentions in the comments below, Jeremy is now at Google.
Proof that I am a geek
I found this very cool
Technorati Tags: Komodo Dragons, Virgin Birth, Science, Computational Chemistry, Molecular Modeling, Geeks, Microsoft, Novell, Jeremy Allison, Linux
PLoS One - The implications
December 20, 2006
This blog supports open access, the Creative Commons license and efforts like PLoS One. The challenge facing the community now is simple: given the ability to use modern technology to develop a new paradigm for publishing, can we engage the rest of the scientific community to participate in this scientific conversation? PLoS One allows you to annotate sections of a paper (something I quite like), see other annotations and participate in discussions. Without participation from readers and users of the science, this experiment will not work, so here is my request to the community: participate. It is better for science if we do.
Update: As Krish points out in the comments, Nature’s experiment has come to an end due to a lack of participation. I am going to look at it as a glass half full and say that while the spirit was good, the implementation could have been better. I think with PLoS One, the site design and the marketing around it facilitate more communication. Now it’s time for people to take advantage of it.
Technorati Tags: Open Publishing, Open Access, PLoS One, Creative Commons, Conversation


