
I’ve been meaning to write about Freerisk.org for a while now, but was only reminded yesterday when I read the Wired article about Toby Segaran’s (and Jesper Andersen’s) new project.
Freerisk.org pulls in financial data from the SEC in the XBRL format, lets the community add annotations, and then makes that data available to standard risk analysis algorithms and, best of all, to anyone who wants to apply their own algorithms. My first reaction was that this is exactly what we want to be able to do in bioinformatics: keep the data available, add annotations, and have a sandbox in which algorithms can be applied and developed.
The finance geek part of it is interesting enough, but I got interested in Freerisk for the general idea, especially coming from a field where a lot of data is publicly available but sandboxes and platforms for analysis and for testing new algorithms are not necessarily there, although there is a lot of intent. From the about page of Freerisk.org:
Freerisk is a project with the goal of making freely available the data, algorithms and tools necessary to perform risk modeling. We believe that risk management is too important to society to be an arcane subject or competitive advantage.
You could easily replace “risk management” with biology or genomics, or something similar.
The pieces that Freerisk contains are even more interesting:
- An open repository of financial data, including financial statements for public companies
- A standards-based API for querying financial data
- A distributed method for designing and running risk models
- Open-source tools for parsing and handling financial data
- Educational materials on risk-management
This is a hacker’s playground. We need something like this in the informatics community, especially as our data volumes grow. It’s an ethos that we seem to lack in general. Part of that is because we need to publish our data, but there is a broader community of analysts and developers this could appeal to. Resources like these are needed, not just for finance, but in many other areas. The key is to find enough interested people to contribute. We have some of these pieces in the bioinformatics space, but they are somewhat fragmented, and analytics is the weak spot at this point.
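To make the sandbox idea a little more concrete, here is a minimal Python sketch of the pattern: shared records, community annotations, and a registry that anyone can plug a model into. Everything in it is hypothetical and for illustration only. The `Filing` fields, the `register` decorator, and the `debt_ratio` model are my own stand-ins, not Freerisk's actual data model or API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Filing:
    """Hypothetical record for one company's published figures."""
    company: str
    total_assets: float
    total_liabilities: float
    annotations: List[str] = field(default_factory=list)  # community notes


# A model is any function that maps a filing to a score.
Model = Callable[[Filing], float]
MODELS: Dict[str, Model] = {}


def register(name: str) -> Callable[[Model], Model]:
    """Decorator that adds a model to the shared registry under `name`."""
    def decorator(fn: Model) -> Model:
        MODELS[name] = fn
        return fn
    return decorator


@register("debt_ratio")
def debt_ratio(filing: Filing) -> float:
    """Toy model: liabilities as a fraction of assets."""
    return filing.total_liabilities / filing.total_assets


def run_all(filing: Filing) -> Dict[str, float]:
    """Apply every registered model to the same open record."""
    return {name: model(filing) for name, model in MODELS.items()}


filing = Filing("ExampleCo", total_assets=200.0, total_liabilities=50.0)
filing.annotations.append("community note: figures restated")
scores = run_all(filing)
print(scores)
```

Swap `Filing` for a sequence record or an expression matrix and the same shape would apply to bioinformatics: the data stays open, and contributing a new algorithm is just one more registered function.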
The big machines
So the latest Top 500 list is out. Why doesn’t it excite me as much as it used to? Partly because many of those machines are not easily accessible, while other computing resources are within reach. Partly, perhaps, because a lot of the work I am interested in doing doesn’t really need a machine in the top 500. Of course, having access to one of those machines lets you address some problems you couldn’t tackle any other way, and IMO they should only be used for such problems.
One of the better posts about this year’s list comes from Chris Peters at Intel. He presents a different perspective on the list and notes some trends. For example, the 10th fastest machine on this year’s list delivers more FLOPS than all 500 machines on the 2000 list combined.
While the post has a definite Intel angle to it, Chris notes the point I made earlier. Today, massive computing resources are much easier to get at, and there are new software stacks, whether for clustering or for massive data-intensive computing. Personally, I think how we consume computing, and the nature of our compute codes, is going to go through a transformation in the next decade, and more people are going to be doing large-scale computing and solving interesting problems.
Will the Top 500 list become meaningless? Not really. There is always room for massive floating point performance and certain problems for which you just need the kind of raw horsepower that the big iron provides. For others, we have a lot of resources that we can get our hands on.