I read this first in Bioinform, where it was reported that from October 15, 2006 the PDB will no longer accept theoretical models. The decision was the result of a workshop held last fall, the results of which have been published in Structure.
As someone long involved with protein structure prediction, I should be appalled at this slight, but I am not. Kevin Karplus states the obvious when he says that the PDB should focus on quality and not on quantity. The quantity is somewhat of an offshoot of the various genome sequencing projects and the Protein Structure Initiative, where homology modeling became an excellent tool for modeling vast numbers of newly sequenced proteins of unknown structure. Structure prediction has made great leaps in the past decade, with the ability to model very distant homologs, the development of threading techniques, and the wonderful work done by David Baker on ab initio structure prediction, but the quality of predicted structures is still a far cry from that required for many applications. Fold prediction is a good method for classification, and structure prediction is useful for developing functional hypotheses. The field would benefit greatly from improved methods for loop and sidechain prediction, the former being a serious issue.
It’s clear that computation has played a role in structural and functional biology and will play an even greater one, especially as we try to understand allosteric effects, design better drugs, and evaluate protein-protein interactions. It is therefore important that the community come together, not in the competitive framework of CASP but in a more congenial, cooperative one, where the discussion focuses on standards, quality assessment criteria, and a centralized resource for computationally derived structures.
Where do I think the field will go? The late 90s saw a huge jump in the quality and performance of protein structure predictions. Another quantum jump is required to take the field to the next level. A peer-reviewed knowledgebase linking computed structures to the PDB and other relevant classifications would be a good first step, and it looks like the Protein Structure Initiative might take the lead on this issue. I don’t have the paper, so I can’t confirm how much of the workshop focussed on methods. In terms of methodology, the focus will shift (is shifting) from throughput to quality, with an emphasis on molecular interactions and an increased use of physical potentials for improving structural quality. Improved treatment of the microenvironment of sidechains and methods to generate more native-like conformations will gradually begin to hit the mainstream. Perhaps novel search strategies and algorithms that take advantage of new computer architectures and their performance will become a focal point for some research groups. Computers are going to get faster and better. Our methods should adjust accordingly, and perhaps it makes sense to take a tiered approach, with the choice of method depending on the application.
Edit: I forgot to add this part. One telling sign of the lack of trust in computational models is that whenever people create non-redundant structure libraries for comparative modeling, one of the first things tossed out as part of the QC procedure is any PDB entry that is computational.
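For what it's worth, that QC step can be sketched in a few lines of Python. This is only a minimal sketch, assuming the entries are in the legacy PDB flat-file format, where the EXPDTA record carries the technique and theoretical models are labeled THEORETICAL MODEL. The function names, the entry IDs, and the toy header text are all made up for illustration, and real library-building pipelines will do far more than this.

```python
def experimental_method(pdb_text):
    """Return the EXPDTA technique(s) found in PDB-format text, upper-cased.

    Returns an empty string when no EXPDTA record is present.
    """
    parts = []
    for line in pdb_text.splitlines():
        if line.startswith("EXPDTA"):
            # Columns 11-79 hold the technique; the continuation number
            # (columns 9-10) is skipped by starting at index 10.
            parts.append(line[10:79].strip())
    return " ".join(parts).upper()


def is_computational(pdb_text):
    """True if the entry declares itself a theoretical model."""
    return "THEORETICAL MODEL" in experimental_method(pdb_text)


def drop_models(entries):
    """Keep only entries that are not flagged as theoretical models.

    `entries` maps an ID to PDB-format text. Entries without an EXPDTA
    record are kept, which is an assumption of this sketch, not a rule.
    """
    return {pid: text for pid, text in entries.items()
            if not is_computational(text)}


# Toy entries with hypothetical IDs, not real PDB records.
entries = {
    "EXP1": "HEADER    HYDROLASE\nEXPDTA    X-RAY DIFFRACTION\n",
    "MOD1": "HEADER    HYDROLASE\nEXPDTA    THEORETICAL MODEL\n",
}
kept = drop_models(entries)
```

Here `kept` retains only the entry whose EXPDTA record names an experimental technique. Checking the header record rather than guessing from resolution or file size keeps the filter cheap and explicit.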
When the rubber meets the road, a model is a model and an experimentally determined structure is a "real" structure -- with all the inherent caveats that come with concentrating a protein to the extent required for structure determination.
I've used models, I LIKE models -- they're helpful and thought-provoking. But if the PDB's mission is as a structure database, it shouldn't be littered (and I do mean littered) with models.
What's wrong with a separate db of homology models? Would researchers turn to it if Their Favorite Protein had not been determined experimentally? I think so, but no one wants to populate the second banana database -- just ask the BMRB folks.
Strictly speaking, even a crystal structure is a model, since it fits the structure to experimental observations, i.e. the electron density... and therein lies the problem. Take space travel, for instance. All the calculations that figure out how we get to the moon, go around it, etc. are "models", but the underlying physics is very well defined. When it gets down to molecular-level detail, life gets a lot more complicated. We are some way away from a theory that can describe molecular systems the size of proteins at the desired level of accuracy.
Structure prediction has a long way to go – The PDB says “no” to computational models
Further Reading
Managing structural genomics data