Sun and Amazon jump into the pool together
May 5, 2008
At JavaOne, one of the big announcements was a hookup between Amazon, specifically EC2, and OpenSolaris (finally generally released as a full open source OS). The collaboration between Amazon and OpenSolaris will give customers access to OpenSolaris (for free), MySQL premium technical support, and more. The key selling points are ZFS and DTrace. Now, I am a big Linux guy, but options are always good, and enterprise relationships/partnerships are a sign of the maturing and relevance of cloud computing.
Aside. It’s interesting that Sun talks about OpenSolaris as the OpenSolaris community
Technorati Tags: Cloud Computing, Amazon Web Services, Sun Microsystems, OpenSolaris, ZFS
And I am very skeptical
July 4, 2007
Via TechCrunch
Orbo produces free, clean and constant energy - that is our claim. By free we mean that the energy produced is done so without recourse to external source. By clean we mean that during operation the technology produces no emissions. By constant we mean that with the exception of mechanical failure the technology will continue to operate indefinitely.
The sum of these claims for our Orbo technology is a violation of the principle of conservation of energy, perhaps the most fundamental of scientific principles. The principle of the conservation of energy states that energy can neither be created or destroyed, it can only change form.
Because of the revolutionary nature of our claim, not only to the world of science but to the world in general, Steorn issued a challenge to the scientific community in August 2006 to test our technology and report their findings. The process of validation that has resulted from this challenge is currently underway, with results expected by the end of 2007.
What do all of you think? Me, I think the laws of thermodynamics are not going to be broken. I thought perpetual motion machines and their ilk were a dead horse
Technorati Tags: Orbo, Thermodynamics, Conservation of Energy
Answering some mathematical questions
May 23, 2007
Keith Robison started this and Sandra answers the following questions
1. Are you a biologist, if so what kind?
2. What math did you take in college?
3. What math do you use?
4. What math do you wish you’d studied?
5. How do you use math in your job (or research)?
(Note: I still can’t call it “math”. It will always be “maths”)
Onto the answers
1. These days I am outside looking in, but I am a trained physical chemist, with the bulk of my research done on protein structure, dynamics and photophysics, so I consider myself a biophysicist/structural biologist
2. Going to school in India, by the time I got into college, I had a pretty thorough grounding in trigonometry, probability, algebra and calculus. In college just more advanced versions, especially calculus. Only a little formal statistics. That was mostly self taught
3. Pretty much all my calculus, algebra and probability. As a physical chemist, you don’t really have a choice.
4. More statistics and discrete mathematics
5. In graduate school, most of my research was quantum chemistry and molecular simulation, so lots of calculus, probability, and algebra. Since then, algebra, statistics, and probability. These have been applied to bioinformatics (hypothesis testing, etc.), algorithm development (mostly in bioinformatics), a lot of free energy calculations, and these days lots of business modeling.
Technorati Tags: Mathematics, Biology, Education
Coarse graining molecular simulations
February 11, 2007
In the past I’ve talked about Elastic Network Models (ENM) at bbgm. These can be looked at as a coarse grained scheme that allows scientists to look at motions in time scales beyond the traditional realm of atomistic molecular dynamics simulation. In fact, over the past few years, the trend towards multiscale simulations has increased significantly. This is partly due to the fact that increased compute power is making it possible to study larger systems at longer timescales, but also because it is becoming increasingly necessary to approach a problem from multiple directions. Multiscale modeling can be simplified into the following scheme (in length scale)
Electrons (Quantum mechanics) –> Atoms (Molecular mechanics) —> Segments/Reduced representations (Coarse grained/Mesoscale) –> Continuum/Bulk (Continuum dynamics/Finite element methods).
In addition to the length scale, these are also fairly representative of the addressable time scale (fs –> ns –> ms –> higher, respectively). In this post, I will talk about the move towards addressing longer time scales and larger systems using various coarse graining techniques.
Normal mode analysis (NMA) is a common way of trying to look at longer timescale vibrational motions in molecular systems, e.g. proteins. However, NMA is computationally expensive, which limits the sizes of the systems to which it is applicable and the number of modes that can realistically be calculated. Qiang Cui and others have developed Block Normal Mode (BNM) methods, which use a sparser Hessian; this brings some computational advantages and allows application to larger systems. Personally, I think BNM-type approaches might be very relevant to the sort of MM-PBSA methods described in the previous article. However, for studying large scale cooperative motions, ENM is probably the best method that I know of.
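To give a feel for why elastic network models are so cheap, here is a minimal Python sketch of the simplest variant I know of, a Gaussian network model: every pair of sites closer than a cutoff is joined by an identical spring, and the modes come from diagonalizing the resulting connectivity (Kirchhoff) matrix. The function name `kirchhoff_matrix`, the 7.0 cutoff and the five-site toy chain are assumptions for illustration, not taken from any particular package or paper.

```python
import numpy as np


def kirchhoff_matrix(coords, cutoff=7.0):
    """Connectivity matrix of a Gaussian network model.

    Off-diagonal entries are -1 for pairs of sites closer than `cutoff`
    and 0 otherwise; each diagonal entry is the number of contacts.
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Contacts between distinct sites only.
    contact = (dist < cutoff) & ~np.eye(len(coords), dtype=bool)
    gamma = -contact.astype(float)
    gamma[np.diag_indices_from(gamma)] = contact.sum(axis=1)
    return gamma


# Toy chain of five sites spaced 3.8 apart along x (hypothetical coordinates).
coords = [(3.8 * i, 0.0, 0.0) for i in range(5)]
gamma = kirchhoff_matrix(coords)

# Eigenvalues come back in ascending order; the first is the trivial zero
# mode, and the next few eigenvectors are the slowest collective modes.
eigenvalues, eigenvectors = np.linalg.eigh(gamma)
```

With one site per residue, the matrix for a protein is only a few hundred rows wide, which is where the speed advantage over an all-atom Hessian comes from.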
One of the problems that began to interest me a couple of years ago was how people were taking results from atomistic simulations, usually MD simulations of biomolecules or polymers, and transferring that information into coarse-grained models that could be used to study molecular machinery and look at the mechanical properties of biomolecular systems. Much of that interest came after listening to a talk by Greg Voth. Protocols to go from quantum mechanical representations of model systems and transfer those parameters to classical molecular mechanics force fields have been fairly well established for a while now. On the other hand, the jump from molecular systems to the kinds of coarse graining schemes used to represent more macroscopic motions and properties is less well understood, especially for biomolecular systems. Mesoscale approaches for polymer mixes have been used for a while, but are still a work in progress (Disclaimer: I was actively involved in this area towards the end of my stay at Accelrys and some of the most knowledgeable people in the field still work there). Biomolecular systems create their own challenges, and a chunk of the community is actively looking at this problem.
One of the key challenges in any such coarse-graining scheme is the need to retain as much information content as possible without losing the advantages that coarse-graining brings. The general approach is to reduce the system into some coarse-grained representation and then represent the forces between the reduced segments in a meaningful way. Much of the challenge lies in part two. What I learnt from Voth’s talk, and subsequently from reading his papers, was how little information content most of the methods in place retain. The other thing that started becoming apparent was how difficult it is to generalize CG potentials, if it is possible at all. Voth’s approach to coarse graining is a method he calls Multiscale Coarse-Graining (MS-CG), which is used to systematically derive a coarse-grained potential from atomistic-level interactions.
The method used by Voth is called force matching, a method that was developed for condensed-phase systems. The system uses a trajectory to derive a pairwise effective forcefield. The method is agnostic of how the trajectory is generated, the most common method being atomistic MD trajectories. I won’t go into the detailed derivation here, but here are some of the features that jumped out at me
(a) The authors fit their CG force field to a number of shorter system MD trajectories and average over those. Previous techniques used whole trajectories
(b) CG sites are associated with the center of mass (CM) of the underlying atomic groups. Applying the force matching (FM) procedure to these data yields the effective interaction between the CG sites as it is present in the underlying atomistic simulation.
This is not a mandatory selection. The geometrical center could also be chosen as can a hybrid approach.
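To make the mapping step concrete, here is a minimal Python sketch (not the authors' code) that places one CG site at the center of mass of a group of atoms and collects the force acting on that site. The function name `map_to_cg_site` and the toy masses, coordinates and forces are made up for illustration, and taking the site force as the plain sum of the atomistic forces on the group is my assumption for a center-of-mass site.

```python
def map_to_cg_site(masses, positions, forces):
    """Map one group of atoms onto a single coarse-grained (CG) site.

    Returns the center of mass of the group and the net force on it.
    """
    total_mass = sum(masses)
    # Mass-weighted average of the atomic coordinates (center of mass).
    com = tuple(
        sum(m * p[axis] for m, p in zip(masses, positions)) / total_mass
        for axis in range(3)
    )
    # Reference force on the site: sum of the atomistic forces on the
    # group (illustrative assumption for a center-of-mass mapping).
    net_force = tuple(sum(f[axis] for f in forces) for axis in range(3))
    return com, net_force


# Toy three-atom group (made-up numbers, arbitrary units).
masses = [12.0, 1.0, 1.0]
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
forces = [(0.5, 0.0, 0.0), (-0.25, 0.25, 0.0), (0.0, -0.25, 1.0)]

com, net_force = map_to_cg_site(masses, positions, forces)
```

Passing equal masses for every atom gives the geometrical center instead, which is the alternative choice mentioned above.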
A lot of the work on multiscale biomolecular simulations has been done on lipids, vesicle formation, etc. To generate the FM data, MD simulations in explicit solvent (TIP3P) are carried out (in the original paper at least), and the Particle Mesh Ewald (PME) method was chosen to model long range electrostatics. This system was then coarse-grained (using the center of mass approach) and the force matching procedure was applied to 4000 configurations from a 40 ps trajectory (which is fairly short). The authors found that their approach was able to reproduce the structural properties of the lipid bilayer quite accurately, and the input data is not any different from a typical MD simulation. For more complex systems, one might need to be more creative about the CG procedure. Automation of any such procedure would be a must for large scale application as well.
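The fitting step itself can be sketched in a few lines of Python. This is a toy under stated assumptions, not the published implementation: the pair force magnitude between CG sites is written as a linear combination of basis functions of the distance, and the coefficients are found by least squares against the reference forces from every configuration. The names `pair_design_matrix` and `force_match`, the two-term basis (a constant plus a term linear in r) and the synthetic data are all hypothetical; a real application would use a far more flexible basis and the mapped forces from the MD trajectory.

```python
import numpy as np


def pair_design_matrix(coords, basis):
    """Rows of the linear system for one configuration.

    coords: (n_sites, 3) array of CG site positions.
    basis:  list of functions phi_k(r); the pair force magnitude is
            modelled as f(r) = sum_k c_k * phi_k(r), linear in the c_k.
    Returns an array of shape (3 * n_sites, len(basis)).
    """
    n = len(coords)
    rows = np.zeros((n, 3, len(basis)))
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            rij = coords[i] - coords[j]
            dist = np.linalg.norm(rij)
            unit = rij / dist
            for k, phi in enumerate(basis):
                # Contribution of basis function k to the force on site i.
                rows[i, :, k] += phi(dist) * unit
    return rows.reshape(3 * n, len(basis))


def force_match(configs, ref_forces, basis):
    """Least-squares fit of the coefficients c_k to the reference forces."""
    A = np.vstack([pair_design_matrix(c, basis) for c in configs])
    b = np.concatenate([f.reshape(-1) for f in ref_forces])
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coeffs


# Synthetic check: forces generated from known coefficients should be
# recovered by the fit. 20 configurations of 6 sites, all made up.
basis = [lambda r: 1.0, lambda r: r]
true_coeffs = np.array([2.0, -0.5])
rng = np.random.default_rng(0)
configs = [rng.uniform(0.0, 5.0, size=(6, 3)) for _ in range(20)]
ref_forces = [
    (pair_design_matrix(c, basis) @ true_coeffs).reshape(-1, 3)
    for c in configs
]
fitted = force_match(configs, ref_forces, basis)
```

Because the model is linear in its coefficients, adding more configurations only adds rows to the same least-squares problem, which is why a few thousand snapshots from a short trajectory can be enough.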
This is of course not the only method. Work by Julian Shillcock, Mikko Karttunen, Qiang Cui, Aatto Laaksonen, etc. should also be considered, but I have always found Voth’s approach to be the most elegant.
Thus ends the formal part of Just Science week. As Arunn has mentioned, blogging about pure science is a lot harder than it sounds. I found a lot of great science blogs though and got a lot of traffic this week, so at least a few people are interested in biomolecular simulation and protein structure prediction. Now .. back to our usually scheduled programming.
Further Reading:
Multiscale modeling of MscL
Technorati Tags: Just Science Week, Multiscale Modeling, Greg Voth, Biomolecular Simulation, Molecular Modeling
Evaluating protein-ligand interactions
February 10, 2007
This week has definitely been one of past lives. At this rate I might just end up writing out electrochemistry.
Today we take a look at evaluating protein-ligand interactions. I’d like to start with a quote from Tack Kuntz that I have used in many presentations (from Science (1992), 257, 1078)
The central assumption of structure-based design is that good inhibitors must possess significant structural and chemical complementarity to their target receptor
There are two basic approaches to trying to identify appropriate inhibitors. Ligand-based approaches as commonly used in pharmacophore modeling use physical and chemical traits of known ligands to try and identify novel inhibitors, which have similar features. Receptor-based approaches, which I am more familiar with and will talk about, look at things from the other viewpoint. Here, the idea is to find ligands that use structural and other features on the target receptor to identify the best inhibitor. The quote above describes the intent of most of the techniques in place to try and identify suitable inhibitors.
Docking is probably the best known of the methods used to identify the fit between a receptor and a potential ligand. Today, docking is used primarily as part of a virtual screening protocol, wherein a database of ligands is screened against one or more target receptors. Docking actually consists of two distinct parts: the “docking” part, which is the search scheme to identify suitable conformations or poses, and “scoring”, which is a measure of the affinity of various poses. Someone I know put it like this: docking schemes try and find local minima, while scoring functions try and look at the interactions from a global perspective.
One of the reasons it is necessary to use such a two-stage approach is the difficulty in sampling conformational space sufficiently. Most docking methods use a number of heuristics and usually only add conformational flexibility to the ligand (keeping the protein rigid). In order to account for the rigidity of the receptor and to enable the lowering of energy barriers, a number of docking methods use softcore potentials, which allow some overlap of van der Waals radii. Advanced search schemes, e.g. CDOCKER, a grid-based MD procedure, are being implemented for improved sampling of conformational space (Disclaimer: I used to be responsible for molecular simulation products at Accelrys). There is a significant effort to try and take into account the flexibility of the receptor. Methods range from techniques that explicitly allow receptor sidechains to move (usually within the neighborhood of the binding site), to selecting receptor conformations from MD trajectories, to methods that address induced fit during docking, to using multiple target structures during the docking procedure.
While docking is generally considered to be sufficiently accurate for most cases, the quality of scoring functions is a matter of much debate. It is generally accepted that scoring functions are not accurate enough to give adequate enrichment in virtual screening procedures. While a number of good scoring functions have been developed over the years, they tend to be system dependent and their accuracy varies. This has led to a move away from knowledge/rules-based approaches to more physically relevant techniques. While this does have an impact on the speed of virtual screening, more expensive techniques can be used as part of a hierarchical procedure to get improved results. Research done at Roche (and some preliminary work that we did at Accelrys) has shown that methods such as the MM-PBSA technique and Linear Interaction Energy (LIE) approaches are quite good at rank ordering ligands. I am not sure how generalizable these methods are, but implementations like those in Pipeline Pilot do make application much easier, and it is possible that pharmaceutical companies will increasingly begin to use such methods in addition to classical scoring functions like GoldScore, GlideScore, LigandScore, etc. The MM-PBSA method, developed by the late Prof. Peter Kollman, uses molecular mechanics to represent intermolecular and intramolecular forces and the Poisson-Boltzmann solvation model to calculate the electrostatic component of the binding free energy. Entropic terms can also be added, but in practise are usually ignored. The Roche study, and a lot of the work that we did, did not use molecular dynamics for sampling. It is my belief that in a lot of cases, sampling clusters of pose space and using simple unconstrained energy minimization will be sufficient (the energy minimization can be done in explicit solvent). The LIE technique, popularized by Johan Åqvist, historically uses the scaled sum of the differences in the non-bonded interaction energies between the bound state and a reference configuration to estimate the binding free energy. This method is very easy to implement and, while more empirical, has the potential to replace some of the scoring functions used today (or enhance them as in the ELR method developed by Jorgensen).
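To show just how easy LIE is to implement, here is a minimal Python sketch of the estimate in the form I am familiar with: the change in the average ligand-surroundings van der Waals and electrostatic energies between the bound state and the free (reference) state, each multiplied by a scaling coefficient. The function name, the default coefficients (alpha = 0.18, beta = 0.5, gamma = 0) and the energy samples are placeholders for illustration; in practice the coefficients are calibrated for the system and force field at hand.

```python
from statistics import mean


def lie_binding_free_energy(vdw_bound, vdw_free, elec_bound, elec_free,
                            alpha=0.18, beta=0.5, gamma=0.0):
    """Linear Interaction Energy estimate of the binding free energy.

    Each argument is a sequence of ligand-surroundings interaction
    energies sampled from a simulation of the bound or free ligand.
    alpha, beta and gamma are empirical parameters (placeholder defaults).
    """
    delta_vdw = mean(vdw_bound) - mean(vdw_free)
    delta_elec = mean(elec_bound) - mean(elec_free)
    return alpha * delta_vdw + beta * delta_elec + gamma


# Made-up energy samples (kcal/mol) for one hypothetical ligand.
dg = lie_binding_free_energy(
    vdw_bound=[-40.0, -42.0],
    vdw_free=[-20.0, -22.0],
    elec_bound=[-30.0, -34.0],
    elec_free=[-25.0, -27.0],
)
```

Only two sets of simulations are needed per ligand, one bound and one free, which is what makes the approach attractive as a rescoring step after docking.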
I can’t believe I’ve come this far and not said that docking and scoring are, in the end, methods to evaluate molecular recognition. The ultimate approaches for doing so today would be thermodynamic techniques, such as the Bennett Acceptance Ratio, thermodynamic integration (TI), etc that use massive molecular dynamics simulations and sophisticated sampling schemes, usually in explicit solvent to solve the complete thermodynamic cycle. These methods are limited by the quality of the underlying force fields and the completeness of the underlying sampling.
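As a rough illustration of what thermodynamic integration boils down to once the expensive sampling is done, the free energy change is the integral over the coupling parameter lambda of the ensemble average of dU/dlambda. A minimal Python sketch using the trapezoidal rule follows. The function name `ti_free_energy` and the lambda windows and averages are invented for the example; producing converged averages at each window is the hard part and is not shown.

```python
def ti_free_energy(lambdas, mean_dudl):
    """Trapezoidal estimate of the integral of <dU/dlambda> over lambda.

    lambdas:   increasing coupling-parameter values, e.g. from 0 to 1.
    mean_dudl: ensemble average of dU/dlambda sampled at each lambda.
    """
    if len(lambdas) != len(mean_dudl) or len(lambdas) < 2:
        raise ValueError("need matching sequences with at least two windows")
    total = 0.0
    for i in range(len(lambdas) - 1):
        width = lambdas[i + 1] - lambdas[i]
        total += 0.5 * width * (mean_dudl[i] + mean_dudl[i + 1])
    return total


# Invented averages for five evenly spaced windows (arbitrary energy units).
lambdas = [0.0, 0.25, 0.5, 0.75, 1.0]
mean_dudl = [10.0, 6.0, 3.0, 1.0, 0.0]
delta_g = ti_free_energy(lambdas, mean_dudl)
```

Each entry in `mean_dudl` stands for a full simulation at that lambda value, which is why these methods need so much more compute than docking and scoring.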
The field has advanced a lot in recent years, with academic and commercial efforts underway to take advantage of computing infrastructures, faster machines, improved theory, etc. to get better results. The need for more efficient lead discovery is driving the development of in silico virtual screening methods, and I suspect that the next few years will see a change in the quality of the results and the ease of use of more advanced techniques. Trying to look at better methods of evaluating protein-ligand interactions was one of the most fun subjects I have ever tried my hand at.
Note: No mention of polarizability and quantum mechanics in the above article, but that’s another area of active research.
Technorati Tags: Just Science Week, Protein-ligand interactions, MM-PBSA, LIE, Docking, Structure-based drug design



