Image by matthewsim via Flickr
"Sharing your changes" is a great post on some of the advantages of using Git (or any distributed version control system). Rich Apodaca has an even more interesting post on using GitHub for chemistry, particularly in the context of revision-controlled datasets.
In general, we are becoming increasingly interested in leveraging public data resources. Indeed, even in pharma there are people with a great interest in combining internal data with public data to try to get more relevant results. But perhaps the biggest trend going forward will be the development of mechanisms that allow you to fork and remix data, much as we have done with code and media. The same paradigms apply, although the mechanisms might vary. The comment thread on Rich’s post is a must-read as well. My particular favorite is one by Rajarshi Guha:
“I definitely like the idea of mashing up databases. It saves a lot of hassle related to hosting, managing, updating the datasets on my own. If everybody had clean, well documented APIs that would make life much easier.”
Data distribution and versioning