The struggle between trying to please your committee and provide adequate support for the software you create remains. Is this a problem with the focus of graduate studies, the funding bodies, or the expectation of users?
These words come from a post by Nils Homer up on Anthony Fejes’ blog. Nils talks about some of the challenges of being a software developer in the world of bioinformatics. In his post he quotes a friend:
… the software engineering and implementation of several of the methods consumed significantly more time and energy than the original research and paper writing. This is an important but less recognized component of methods development, as it prevents the work from remaining just interesting ideas, but puts them into practice
For reasons Nils articulates so well, research code is often brittle, poorly documented and not sustainable at all. There are always exceptions, but that is the norm. The world is full of aligners and scripts and small apps with no documentation and the person who wrote the code far removed and unable to provide any help. Nils asks “what are the benefits of creating usable software and to support users who are not the ones provide funding?”
Here I will deviate from the post and provide my views, some of which do match Nils’. The number one benefit is much the same as for any experimental procedure, protocol or technique. Software is how computational thoughts and ideas are implemented. Being able to capture, optimize and share such ideas and protocols is good for science. Good, well documented, well supported and well understood software might also result in less software bloat and fewer repeated implementations of exactly the same piece of software. In a perfect world, the software would be open sourced, and a community would develop around it, resulting in improvements to the software over time. For all that I complain about the sometimes closed communities around molecular dynamics code, they do benefit the overall functionality and direction of the codes.
But I also think there is a place in the life science world for the professional software developer: someone who can implement algorithms robustly, think about things like database optimization and multi-tenancy, write good UIs, etc. I’ve seen too many examples of this working to ignore, and I strongly believe that while all bioinformaticians today should be good programmers and write robust algorithms, you also need software developers, or at least folks whose focus is not on cranking out papers, to develop applications, data management systems, robust pipelines, etc. Those are as much a part of modern science as the sequence search algorithm.
It’s good to see programming and code get more attention these days. In a recent blog post following an NHGRI workshop, Sean Eddy wrote:
A program that funds developers in much the same way that HHMI or the NIH Pioneer Awards fund people not projects. NHGRI could allocate stable long-term funding to a small but influential number of individual developers. The history of the field is that the best software in the field is often an unplanned labor of love from a single investigator; the history of software development shows that the disparity between the best developers and average ones is enormous, so business studies recommend models that enable highly skilled developers to focus on what they do best; and the best developers are often quirky people who don’t write grants well.
This is the kind of effort we need. As Steve Ballmer once (in)famously chanted: “Developers, Developers, Developers.” We need more, and better ones, or perhaps simply better appreciated ones.



5 Comments
Outsource the boring software development and at least some, if not all, of the maintenance?! Even at the RA level…
Maintenance is actually the key. Most researchers don't write maintainable software and unfortunately that's rarely a focus.
I write both research code and application interfaces. It takes much longer to write a good interface than it does to write the core algorithm. However, the time is worth it if an algorithm is going to be used more than once.
I have observed that much research code is not well-written, -connected, -tested, and -documented. Much of the code is ad hoc, and there is also the occasional manual step (data cleaning, sorting, manipulation in Excel files, etc.). I suspect that if the analysis on the data were performed again, it would generate different results, because the original analysis could not be repeated step by step.