So today, I tried to download MODELLER, which is free for academics and $$$ for commercial use via Accelrys (full disclosure: while I did not directly manage MODELLER at Accelrys, I had indirect responsibilities). I completely understand that part. The problem is that the MODELLER license does not seem to address what I want to do: hobby science. So I had to wait for my request to be approved, which it wasn’t.
There are two thoughts that arise from this exercise, or maybe three. First, it’s clear that when the MODELLER license was written, personal research use was not considered. It harks back to the assumption that “real science” was done either in academia or at companies. Well folks, that might have been true some years ago, but today the assumption is a bit of a problem. I completely understand that they are trying to prevent the system from being gamed, but in my mind the old model (free for academic, $$$ for commercial via another entity) does not work as classically constructed in this case, for multiple reasons. The whole licensing model does not work for bursty science either, especially when one or more non-academics are involved (this is a question that I took a hard look at once for MD programs).
Which leads me to thought #2. I come from an era when modeling software was local, either on a workstation or on a cluster somewhere. That’s how I always ran CHARMM, MODELLER, WHATIF, various threading packages, MOPAC, GAUSSIAN and various other QM packages. That is how most people run those codes today. Then think about a project that you might want to do, a bursty project spanning geeks across countries and continents. Yeah, modeling doesn’t live well on the programmable web. There are servers out there, especially for structure prediction, sequence alignment, etc., but they seem to belong to a different era of the web.
We need to start thinking about the source-hosted model, at least for academic code. First, source code licenses should target the developers and power users who like tinkering with the code, and that is better done by hosting all academic code at SourceForge, Google Code or GitHub, so that collective intelligence comes into play, rather than people developing their own forks which no one else gets to see. Second, applications should be available on the web, ideally with APIs that make it possible to mash up solutions. Now, automating these tasks is not always trivial, and neither is setup. All of us with hundreds of utility scripts know that, but let’s think about the web when we develop code: not just providing a web server, but how that server can be used as a powerful resource rather than just a job submission and result retrieval backend. I’d love to be able to get access to a NAMD server, run a series of utility tasks and then launch a compute job, where I could dial up a set of servers, etc. It’s also possible to attach utility-based licensing and pricing to such a service.
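To make that a little more concrete, here is a minimal sketch of what the client side of such a service might look like. Everything in it is hypothetical: the `ComputeJob` class, its field names, the prep task names and the per-node-hour rate are my own placeholders, not part of NAMD or any existing service. It only builds a JSON job description that a mashup client could send somewhere, and estimates a pay-for-what-you-use cost.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import List


@dataclass
class ComputeJob:
    """Hypothetical description of one metered job on a hosted modeling service."""
    program: str            # illustrative program name, e.g. "namd"
    input_files: List[str]  # files assumed to be uploaded beforehand
    nodes: int              # how many servers to "dial up"
    hours: float            # requested wall-clock hours per node
    prep_tasks: List[str] = field(default_factory=list)  # utility steps to run first

    def to_payload(self) -> str:
        """Serialize the job as JSON, the way a client might submit it."""
        return json.dumps(asdict(self), sort_keys=True)

    def estimated_cost(self, rate_per_node_hour: float) -> float:
        """Utility pricing: cost scales with the node-hours requested."""
        return round(self.nodes * self.hours * rate_per_node_hour, 2)


job = ComputeJob(
    program="namd",
    input_files=["system.psf", "system.pdb", "run.conf"],
    nodes=8,
    hours=2.5,
    prep_tasks=["solvate", "minimize"],  # made-up task names
)
payload = job.to_payload()
cost = job.estimated_cost(rate_per_node_hour=0.10)  # made-up rate

print(payload)
print(f"estimated cost: ${cost:.2f}")
```

The point of keeping the job description as plain data is that the same payload could be produced by a script, a web form or another service, which is what would make mashups possible in the first place.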
What I am arguing for is new ways to think about how we make software available, and how it is used. This can’t be done at the individual group level, but there is an opportunity here for universities and funding agencies to figure out how they can help facilitate this, and even for companies that might want to commercialize some of these packages.
Comments on this? What would you like to see? How might you access such tools? Would you want mashup APIs?
Technorati Tags: molecular modeling, web services




5 Comments
So you want to host compute-intensive open-source software in the cloud and charge by the cycle. That sounds like it could be a nice sweet spot in between the Web 2.0 "everything is free" model and pricey per-seat licensing fees.
My alternative solution would be to make academic affiliation more accessible. OpenUniversity 2.0 perhaps?
The nice thing about NAMD is that it can use your GPU as a computing resource (http://www.ks.uiuc.edu/Research/gpu/). However, this only works when you have GPGPU-enabled (NVIDIA CUDA, http://www.nvidia.com/object/cuda_home.html) graphics cards. These cards range from less than $100 to more than $1000. So I think the big question is: are users more likely to set up their own system, perhaps even using their own desktop computer and just waiting till it's done? Or is there demand for a paid service that provides access to such (clusters of) machines?
That's a great question actually. I had a long discussion on this issue at Bio-IT World. It totally depends on your needs. I fully agree with Chris Dagdigian that small clusters are essentially dead. We will use desktops/workstations for some tasks (multicores, accelerators, etc.) and dial up extra CPU cycles when required from Amazon etc. That's the best part about AWS. If I want to dial up 100 CPUs I can, and I only need to pay for the time that I am using those cycles.
One Trackback
[...] listen, but my one man movement against scientific web resources is going to continue, both for the access models as well as the funding models. Our two latest examples are the Robetta server and the SDSC protein [...]