Bio::Blogs #3
August 31, 2006
With half the world taking its annual August hiatus, I feared the worst, but in the end I have the pleasure of bringing you the third edition of Bio::Blogs, the favorite blog carnival of those of us who like to mix our biology with computation, or vice versa.
This editorial was started at a Starbucks, only fitting since I move to Seattle next week. So in honor of my new abode, this edition of Bio::Blogs is like the coffee menu at a Starbucks, with offerings from around the world and an eclectic set of flavors.
We start with a report that required some globetrotting. All the way from Heidelberg, none other than Pedro has a report on the recently held Science Foo camp (held at the Googleplex). If there was ever such a thing as living vicariously, I did that through this (and other) reports. It would be wonderful to find out more about the discussions around the communication of science, data and web infrastructure, and the discussion on open science. One topic that seemed to be missing was the need for tools to accurately predict drug interactions with biological systems. I wonder if that ever came up?
From Berkeley, Chris Patil has a short article on advanced sequencing technology. I must admit this is the first time I have ever read anything about mitochondrial DNA arrays. In keeping with the theme of the Ouroboros blog, Chris talks about the implications for aging and age-related diseases. I suspect that in time there will be a number of researchers looking for age-related (and other) markers in mitochondrial DNA.
Evolgen writes about evolution and genetics. In this month’s entry, RPM writes about a rather pertinent subject: the rift in the biological sciences between “wet” biologists and computational biologists. I am sure almost everyone reading this carnival has felt this at some point in his/her career. It is a sad state of affairs that science can build walls between disciplines, especially at a time when silos are increasingly breaking down. I suspect that to a degree, our system is to blame. Computational biology in particular is a very recent field, pulling in mathematicians, computer scientists, theoretical chemists, etc. All too often computational (and bench) scientists work in a vacuum. The post has some wonderful insights on how limiting that is, and why these divisions hurt the field more than anything else.
We get a little French flavor courtesy of Pierre at YAKAFOKON. For those familiar with the blog’s themes, it should come as little surprise that we get this wonderful little bookmarklet allowing people to make offprint requests from their browser.
I save the best for last … Sandra Porter, from my new home of Seattle, and the next host of Bio::Blogs, has a series of fabulous posts on (to quote Tara Smith) “using a bad virus to do something good”. You can tell that Sandra is an educator. The entire series, from which I learnt a lot, is a guide through a set of experiments on HIV evolution, a demonstration of how HIV can be used to show the process of evolution.
Part I. Gives some background info on HIV, a link to a nice animation of the life cycle, and presents the problems that are being investigated.
Part II. Gives instructions for performing the experiment.
Part III. The results (free from any biased interpretation).
Part IV. Sandra’s analysis along with a quick look at the protein structure to see if we can explain why the mutant viruses are positively selected and why these mutations might lead to drug-resistance.
… and this month’s extras
From our previous host is a post on managing structural genomics data.
Out of left field, Christine Herron writes about a favorite subject of these parts, open data standards. While there is no mention of biology, there is much to learn from such discussions.
We end with a jolt from Italy from C. Maria Keet (who I think is Dutch … it’s wonderful how our field is so truly international). She writes about the synergies between computer science and engineering, biology and medicine and a seminar series organized at her university. I like the fact that she brings a philosophical and ethical compass to the discussion as well. Thanks to Pedro for sending this my way.
Next month, Bio::Blogs will be hosted by Sandra Porter at Discovering Biology in a Digital World. I must say that her editorial is eagerly awaited.
Technorati Tags: Bio::Blogs, Blog Carnival, Computational Biology, Bioinformatics, Scifoo, HIV, Evolution, Science, Life Science, Data, Data Standards
powered by performancing firefox
Structure prediction has a long way to go - The PDB says “no” to computational models
August 27, 2006
I read this first in Bioinform, where it was reported that from October 15, 2006 the PDB will no longer accept theoretical models. The decision was the result of a workshop held last fall, the results of which have been published in Structure.
As someone long involved with protein structure prediction, I should be appalled at this slight, but that is not the case. Kevin Karplus states the obvious when he says that the PDB should focus on quality and not on quantity. The emphasis on quantity is somewhat of an offshoot of various genome sequencing projects and the Protein Structure Initiative, where homology modeling became an excellent tool to model vast numbers of newly sequenced proteins with unknown structure. While structure prediction has made great leaps in the past decade, with the ability to model very distant homologs, the development of threading techniques, and the wonderful work done by David Baker on ab initio structure prediction, the quality of predicted structures is a far cry from that required for many applications. Fold prediction is a good method for classification, and structure prediction is useful for developing functional hypotheses. The field would benefit greatly from improved methods for loop and sidechain prediction, the former being a serious issue.
It’s clear that computation has played a role in structural and functional biology and will play an even greater one, especially as we try to understand allosteric effects, design better drugs, and evaluate protein-protein interactions. Therefore it is important that the community come together, not in the competitive framework of CASP, but rather in a more congenial, cooperative framework, where the discussion should focus on standards, quality assessment criteria, and a centralized resource for computationally derived structures.
Where do I think the field will go? The late 90s saw a huge jump in the quality and performance of protein structure predictions. Another quantum jump is required to take the field to the next level. A peer-reviewed knowledgebase linking computed structures to the PDB and other relevant classifications would be a good first step, and it looks like the Protein Structure Initiative might take the lead on this issue. I don’t have the paper, so I can’t confirm how much of the workshop focussed on methods. In terms of methodology, the focus will shift (is shifting) from throughput to quality, with an emphasis on molecular interactions and an increased use of physical potentials for improving structural quality. Improved treatment of the microenvironment of sidechains and methods to generate more native-like conformations will gradually begin to hit the mainstream. Perhaps novel search strategies and algorithms that take advantage of new computer architectures and performance will become a focal point for some research groups. Computers are going to get faster and better. Our methods should adjust accordingly, and perhaps it makes sense to take a tiered approach, with the method selection being dependent on the application.
Edit: I forgot to add this part. One telling sign of the lack of trust in computational models is that whenever people create non-redundant structure libraries for comparative modeling, among the first things tossed out as part of the QC procedure are any PDB entries that are computational.
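That QC step can be sketched in a few lines of Python. This is only an illustration, assuming each entry is available as PDB-format text and that theoretical models are flagged by an EXPDTA record containing THEORETICAL MODEL; the function names and the toy records are hypothetical and not taken from any particular modeling pipeline.

```python
def is_theoretical_model(pdb_text):
    """Return True if a PDB-format entry declares itself a theoretical model."""
    for line in pdb_text.splitlines():
        # EXPDTA records describe how the structure was obtained.
        if line.startswith("EXPDTA") and "THEORETICAL MODEL" in line.upper():
            return True
    return False


def drop_theoretical_models(entries):
    """Keep only entries (id -> PDB text) that are not theoretical models."""
    return {
        pdb_id: text
        for pdb_id, text in entries.items()
        if not is_theoretical_model(text)
    }


# Toy records, made up for illustration (not real PDB entries).
toy_entries = {
    "XRAY": "HEADER    HYDROLASE\nEXPDTA    X-RAY DIFFRACTION\n",
    "MODL": "HEADER    HYDROLASE\nEXPDTA    THEORETICAL MODEL\n",
}
kept = drop_theoretical_models(toy_entries)
```

A real library build would apply further filters (resolution, sequence redundancy and so on) after this one; the point here is only how cheaply the computational entries get excluded.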
Further Reading
Managing structural genomics data
Technorati Tags: Protein Structure Prediction, Structural Biology, Protein Structure Initiative, Molecular Modeling, Algorithms, Structural Genomics
Things I noticed - Edition Cuatro
August 26, 2006
Another week and another edition of Things I Noticed. A week that saw the demotion of Pluto and a rejection of the Fields medal. So what’s been happening in my universe?
An electronic database of drugs
The FDA is proposing a new rule that would make the management of drug information significantly more efficient and effective. This system, which includes significant automation and electronic access of drug records, is long overdue. In addition to the impact it would have on drug developers and healthcare providers, I can see a number of informatics companies adding access to this system as part of their offerings.
MIT + FDA + bioinformatics = drug surveillance + safety
The MIT Center for Biomedical Innovation and the FDA are teaming up to develop new methods to monitor drug safety, specifically for pharmaceutical products and medical devices already available on the market. The proposed methods will focus on automation of post-marketing surveillance and will use a lot of informatics. These methods were previously developed for monitoring the outbreak of infectious diseases. It is also an excellent example of a project that can bring together bioinformatics and medical informatics (biomedical informatics) to look at problems from both a population and underlying molecular perspective.
Solexa and revenue recognition
Given that Solexa has been in the news a bit lately, this nugget on Genomeweb about their revenue recognition caught my eye. These are some of the growing pains of a young company, but they have to be careful. The rug can get pulled out from under the best of them very quickly. It will be very interesting to see where they are about a year from now.
The return of Sun Microsystems?
Maybe they never went anywhere, but TIGR just announced that they were switching from their aging HP Alpha servers to new machines from Sun. The Sun Fire servers they have are AMD Opteron machines running Linux. Sun definitely seems to be making a mini-comeback under new CEO Jonathan Schwartz. Let’s see if they can sustain it. One should note that the number of servers was only 3 (replacing 15 Alphas), so it’s not like TIGR deployed a 100-node cluster. Apparently the winning points for Sun were its excellent relationship with AMD and its non-profit-friendly attitude.
Nanostructures for chip-based assays
In the first of a few nanotechnology related stories, NanoInk received an SBIR grant to develop nanostructures for improved chip-based assays. NanoInk uses Dip Pen Nanolithography (DPN) technology, developed by Chad Mirkin and co-workers at Northwestern University, to design nanostructures for a variety of applications. DPN is an example of a nanotechnology that is fast approaching mainstream acceptance.
… while nanowires detect signals from a single cell
I was fortunate to see Charles Lieber give a talk on this subject very recently. His group at Harvard University has developed silicon nanowires that can measure signals from a neuron. The multiplexed nanodevices that Prof. Lieber’s group develops are among the furthest ahead in detecting signals from one (usually a few) cells. He has a vision of developing inexpensive, ultra-sensitive detection systems which will give his children and others a better quality of life, and he is leading the way, along with a select few others, in making nanotechnology a necessary part of information collection in biology.
… but they still can’t define nanotechnology
I believe that the headline says it all.
All about scifoo
Nature is carrying an editorial on the recently held Scifoo camp, an offshoot of Tim O’Reilly’s foo camps. Some day, I would love to participate in an event like this, if for no other reason than to find out what some of the leading thinkers in a field talk about. At this one, it was no surprise that peer review was a big part of the debate. We all await more detailed reports on the discussions that were held.
Amazon’s Elastic Service
Amazon has released a new service called EC2 that provides “resizable compute capacity”. The target market is web-scale computing for developers. This is an extension of Amazon’s push into scalable services, originally started with the S3 service. I wonder if any bioinformaticians, who do like web services, will take advantage of Amazon’s offering. You can read more about this service here
Niche job boards
37Signals did it, Techcrunch too, even Performancing, so it should come as no surprise to find that GigaOM is jumping into the niche job boards space. Given that bioinformatics, computational biology, molecular modeling, computer-aided drug design, etc. are all “niche” jobs, perhaps some day bloggers/startups in our field will pursue something similar.
… and all innovators should remember
Kathy Sierra’s wonderful advice … “Assumptions have a Sell By date”. In other words, we should all be constantly challenging assumptions, others’ as well as our own.
Technorati Tags: Amazon, Nature Magazine, Scifoo, EC2, Innovation, FDA, MIT, Nanotechnology, Science, Nanomedicine, Chad Mirkin, Charles Lieber, Genomics, Solexa, Weekly Roundup
Bio::Blogs #3 - The final call
August 25, 2006
Bio::Blogs edition 3 is due at my blog on September 1. This is the final reminder. In case you don’t remember, here are the rules:
You can submit a blog entry of your own, or of one that you’ve read and enjoyed. The only “rules” are: it must be from a blog, it should be recent and the topic should be in the broad area of bioinformatics or computational biology. As always, we’ll divide the submissions into conferences, primers/reviews and blog articles. I might throw in some surprises as well.
Send your submissions to bioblogs [at] gmail.com. Since I am not sure the entries are reaching me, please copy info [at] mndoci.com. Spread the word!!!
Biology, computing and web services - Past, present and future
August 22, 2006
The O’Reilly Radar is a rather interesting blog to monitor. Amidst all the Web 2.0 and other IT posts, one finds posts like this one about web services for bioinformatics. Specifically, they point to an article by Prof. Rick Stevens of U. Chicago and Argonne National Labs. The article is part of the August issue of the quarterly, CTWatch, an issue focussed on Trends and Tools in Bioinformatics and Computational Biology.
In the introduction, Stevens writes that the most significant trend in modern biology is the “increasing availability of high-throughput data”. With the sequencing of numerous genomes and the development of new “-omics” techniques, there has been an explosion in the amount of biological information. As the article points out, the challenges of generating integrated datasets of suitable quality are critical and are here to stay for the foreseeable future. I also find it refreshing that he talks about simulation and modeling as part of the whole challenge of computational biology (an aspect often overlooked). While truly predictive modeling is still some years away, computer models at the protein structure level are used every day in academic and commercial research to make informed decisions, often with extremely high quality results. In a few years, assuming computers become more powerful, organism level modeling at multiple temporal and spatial scales will become increasingly predictive and more prevalent (Stevens predicts a 10-20 year timeframe for complex eukaryotes, a number that seems reasonable).
Large-scale, easily accessible computing is going to play a huge role in the development of the appropriate methods for biological research. Any success of predictive computer modeling at the organism level will depend on the quality of the algorithms and how easily they can be accessed by researchers. Left in the hands of the privileged few, the field will flounder. As Rick Stevens suggests, computational infrastructure in the future will be directly connected to experimental infrastructure and there will be a feedback loop between the two, increasing the quality of the interpreted results. The article goes on to talk about some of the notable successes in applying cyberinfrastructure to biology, e.g. the services developed by NCBI, and includes a discussion of roadmap projects (IMO still lacking). He does not spend too much time on one of the more interesting topics, grids, a subject that probably merits a paper of its own (and one on which I have some strong opinions).
The part that caught O’Reilly Radar’s interest is the section on web services. Web services are going to have a profound impact on biology. From the very beginning, whether it be early versions of Entrez or various meta servers for protein structure prediction, to the comprehensive services available on NCBI today or internal services at pharmaceutical companies, the web has played an important role in accessing biological information. Today it is clear that the web is poised to become the dominant route to accessing biological information and executing computational interrogation of biological data. I am not sure I agree with Prof. Stevens that this trend of building web services has not been generally demonstrated. Organizations like IBM and SAIC are using web services to integrate data for their clients, while applications like Taverna, Pipeline Pilot and KDE allow people to develop and publish web services in a highly scalable manner. Of course, the ultimate generalization will occur when biological data can be searched using a service from Google, something that the O’Reilly commentary picks up on. Larry Page has been quoted on Google’s interest in biological data and this is something that the company is already working on. With their vast computational resources and ability to tap some of the leading minds in genomics, I am sure that the company, if it wants, is capable of building out capabilities superior to those available via NCBI. The questions that arise are those of privacy, data sources, ownership, monetization (if any) and others that come up when genomic searching becomes as simple as typing out a query in a search box in your browser.
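To make the idea of programmatic access to NCBI concrete, here is a small sketch that builds a search request for NCBI's E-utilities (the esearch endpoint) in Python. It is not from the Stevens article; the helper name, the example query and the choice of the protein database are my own assumptions, and the code only constructs the URL rather than sending the request.

```python
from urllib.parse import urlencode

# Base address of NCBI's E-utilities web service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db, term, retmax=20):
    """Build an esearch URL for a database and a free-text query.

    Hypothetical helper for illustration: db, term and retmax are
    passed through as the query parameters of the same names.
    """
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Example query (an assumption, chosen only for illustration).
url = build_esearch_url("protein", "HIV-1 protease")
```

Fetching that URL with any HTTP client returns a list of matching record identifiers, which is about as close to "typing a query into a search box" as a script can get.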
The article ends with a wonderful discussion on petascale computing, the challenges and problem space that lie ahead, and the kind of techniques and resources that will be needed to address those problems. Many of those are near and dear to my heart. I can only hope that biologists, chemists and computer scientists can come together to develop algorithms, data standards and visualization systems that help scientists and engineers address many of these problems.
Technorati Tags: CTWatch, BioIT, Cyberinfrastructure, Scientific Computing, High Performance Computing, Web Services, Bioinformatics



