A middle-out web services infrastructure for systems biology
July 6, 2008
One of the thoughts I’ve always had over my years in life science software is the need to adjust to constantly changing needs and technologies. In the world of “omics” software, where both the technologies generating the data and the data types themselves keep evolving, it gets pretty darn complicated to build solid software projects. So a paper on Systems biology driven software design for the research enterprise naturally caught my interest. The paper, from John Boyle et al., discusses the informatics infrastructure being developed at the Institute for Systems Biology. What struck me as I went through the first part was that there was no mention of RESTful architectures, at least not explicitly, which was somewhat of a disappointment.
The architecture they use is a Service Oriented Architecture, pretty much essential in such systems. There are other examples of service-based systems in informatics, e.g. CARMEN and, as I recently found out, EColiHub. Both I3 (the ISB system) and CARMEN use SOAP-based web services. EColiHub uses REST (not sure if that’s in production yet), so you know which one I am biased towards. It should be noted that in a lot of such systems Taverna and BioMoby are either in use or on the cards, so there is a definite move towards a “let’s not reinvent the wheel” mindset, which is a very good thing. In addition to these, there are workflow systems, like Pipeline Pilot, that are ideally suited to developing and deploying service oriented architectures (also usually SOAP-based).
The paper definitely highlights many of the challenges that any such system needs to address. It’s a constant challenge, especially when a lot of the underlying scientific methods and technologies are not at the same level of robustness. In a research environment, you can’t really lock down best practices either, i.e. there needs to be flexibility to explore and to allow people to do things their way. The philosophy seems to be, as is the preference these days, to focus on the middle layer: let people develop their own tools and methods, and provide them with a common integration environment. Now here’s the part I like. The authors have chosen to use an LSID-based system, and it’s pretty easy to see the system being used as a Semantic Web platform.
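For anyone who hasn’t run into LSIDs (Life Science Identifiers): they are URNs of the form urn:lsid:authority:namespace:object, with an optional trailing revision. Here is a minimal sketch of what pulling one apart might look like. This is my own illustration, not anything from the paper; the `LSID` type, the `parse_lsid` function, and the `example.org` identifier are all made up for the example, and the checks are far looser than a real LSID resolver would apply.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    """The parts of a Life Science Identifier (illustrative type, not from the paper)."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text: str) -> LSID:
    """Split 'urn:lsid:authority:namespace:object[:revision]' into its parts.

    Only the overall shape is checked; character-level rules are not enforced.
    """
    parts = text.strip().split(":")
    # Five fields without a revision, six with one.
    if len(parts) not in (5, 6):
        raise ValueError(f"not an LSID: {text!r}")
    # The 'urn:lsid' prefix is treated as case-insensitive here.
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    authority, namespace, object_id = parts[2:5]
    if not (authority and namespace and object_id):
        raise ValueError(f"LSID has an empty field: {text!r}")
    # An absent or empty revision field is reported as None.
    revision = parts[5] if len(parts) == 6 and parts[5] else None
    return LSID(authority, namespace, object_id, revision)


if __name__ == "__main__":
    # Hypothetical identifier, used purely for illustration.
    print(parse_lsid("urn:lsid:example.org:proteins:P12345:2"))
```

The point is that the identifier itself carries who issued it (the authority) and what kind of thing it names (the namespace), which is what makes it a natural hook for Semantic Web style linking.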
I am no guru on architecture, and this is a somewhat formal paper, so I won’t necessarily go there. The key aspect for me is seeing a trend, even in academia, to at least think about formal software development and about architectures and deployment environments that can evolve and be maintained over time. On the other hand, there is the danger of making things too formal and resource heavy. Is this how I might have done it? Given that the architecture is stateless, I still wish it had been developed as a resource oriented architecture. While I do like workflows, I am not convinced that they are the right paradigm in an academic setting, and I am not sure they should go there in the future.
The take-home message is that there is a place for deployment frameworks even in a research setting, and for using good, pragmatic software development practices. Matt Wood could probably give you a long lecture on the latter.
How do you think software resources in research settings should be made available?



