Microsoft just announced a new group of participants for its BioIT alliance and a second proof-of-concept project, this one focused on biomarkers. The announcement got me thinking about the future of BioIT, at least in the medium to long term.
While the volume of data available today seems unmanageable, it is nothing compared to the disparate types and gargantuan volumes of data that will arrive sooner than most people probably realize. Data will be generated by a number of sources, including all kinds of omics research, patient records, clinical trials and others that I probably can’t even think of right now. An emphasis on pharmacovigilance and a general acceptance of translational medicine will create the need to correlate data all the way from the most basic research to the impact of drug treatments several years into a treatment regimen. This means a fundamental change in how we approach data itself. Even today, cheminformatics and bioinformatics are quite poorly integrated, and people pursuing either activity tend to stay in their own silos. Even more challenging is the lack of integration between bioinformatics and medical informatics. As the emphasis of research moves towards understanding systems, these data will need to be readily available to, and understood by, not just expert informaticians but clinicians and bench scientists alike. The electronic healthcare system is not yet a reality, but steps in that direction are being taken.
If I were a CIO at a pharmaceutical or healthcare company, my concern would be to build an infrastructure that can adapt as the needs of users and regulators evolve. On the flip side, vendors (software and hardware) and scientists need to continue to work together to develop appropriate standards. I still feel that the W3C should be involved at some level, since it has shown the ability to maintain and propagate a standard that is more widely used than genomic data ever will be.
Virtualization might be an overused buzzword in IT circles, but it’s a good one. For most applications, assuming security and privacy are maintained, does it matter where a calculation or search is carried out? In a perfect world, end users should be concerned only with what they need. Everyone should have access to the same types of data (with varying levels of access). Proprietary data should have the same format as more widely accessible data, making it easy for licensed content to be distributed and accessed. How the data are presented should depend on the context. In other words, a clinician and a basic researcher in a biotech company might query the same data source, and might even run similar queries, but what they see might differ because each has different goals. Compute power is no longer the bottleneck in achieving these goals. The bottlenecks and challenges lie in network speeds, security, accessibility and, of course, building an ontology for all aspects of healthcare, from the genome all the way to patient data.
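To make the "same data source, different views" idea a little more concrete, here is a minimal Python sketch. Everything in it is an assumption made up for illustration: the `Record` fields, the role names, the `VIEWS` mapping and the placeholder values are hypothetical, and a real system would also need authentication, auditing and a shared ontology behind the field names.

```python
from dataclasses import dataclass

# Hypothetical record mixing research and clinical fields (illustration only).
@dataclass(frozen=True)
class Record:
    patient_id: str
    gene: str
    variant: str
    expression_level: float
    drug: str
    adverse_events: tuple

# Hypothetical mapping from a user's role to the fields that role sees.
VIEWS = {
    "clinician": ("patient_id", "drug", "variant", "adverse_events"),
    "researcher": ("gene", "variant", "expression_level", "drug"),
}

def query(records, role, **filters):
    """Run the same equality filter for any role, then project the
    matching records onto the fields that role is allowed to see."""
    fields = VIEWS[role]  # raises KeyError for an unknown role
    return [
        {name: getattr(rec, name) for name in fields}
        for rec in records
        if all(getattr(rec, key) == value for key, value in filters.items())
    ]

# Made-up placeholder data, not real results.
records = [
    Record("P001", "GENE_A", "variant_1", 2.5, "drug_x", ("rash",)),
    Record("P002", "GENE_A", "variant_2", 0.8, "drug_y", ()),
    Record("P003", "GENE_B", "variant_3", 1.1, "drug_x", ("nausea",)),
]

# Same source, same query, different views.
clinician_view = query(records, "clinician", drug="drug_x")
researcher_view = query(records, "researcher", drug="drug_x")
```

Both calls match the same two records, but the clinician sees patient identifiers and adverse events while the researcher sees gene and expression fields with no patient identifiers. Keeping the projection in one place (the `VIEWS` table) is just one possible design; it could equally live in the data service itself.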
Which brings us back to the CIO. Contrary to popular perception, the role of information services in the future of healthcare is going to be absolutely critical. The efficiency of entire healthcare networks might depend on how effective information services are. The role of the life science community should be to make the lives of the people running those services a little bit easier.

One Trackback
[...] Deepak tells us of the second BioIT project, arguing for inherent infrastructure flexibility to cope with the soon-to-come deluge of data. He points out that compute power isn’t really limiting any more (thanks to clustering), but efficient data management is, especially if you don’t know what next year’s data will look like. Whilst I agree in principle with virtualisation, I can’t really see an implementation that would address the unknown data types problem. Maybe it’s just me? Egon mentions Bioclipse, an open-source workbench for bio/chemi-informatics (check out their blog, too). As far as I can see, it’s a point-and-click interface to data management functionality. I can see why this is useful to Joe Schmoe, but I suspect that this class of approach is incompatible with high-throughput data processing. I’m still a fan of CLIs and scriptability. They are, however, looking for use cases, and it behooves us as a bioinformatics community to pitch in. After all, if we don’t, who will? [...]