I am sure many of you are familiar with Wikia, a for-profit project built on the MediaWiki platform and started by Jimmy Wales, amongst others. The goal of Wikia is to set up vertical wikis around specific subjects. While browsing the site, I chanced upon the Proteins Wiki. Here is what the Proteins Wiki is all about:
The Proteins wiki is intended as a community-moderated database for proteins. Editors are invited to upload verifiable, but not necessarily published information about their favorite proteins and protein methods. However, citation of published references and external links to other databases are encouraged wherever possible.
A major goal of the Proteins wiki is to foster a collaborative scientific atmosphere in which testable hypotheses about proteins can be developed, e.g., functional annotation on the genomic scale. Some researchers here have been funded by the National Science Foundation to annotate computationally and experimentally a significant fraction of the genome of Arabidopsis thaliana; however, we hope that others will find the site useful and will wish to contribute as well.
I love the idea of using a hosted platform (Wikia and the excellent MediaWiki engine) to develop a resource for proteins. The question I have is this: how do NCBI, EBI and RCSB get involved? Is there a way to scrape those sources as a start to populating Wikia? It would be great if there could be a single page for a particular protein that pulls in information from other sources, both automatically and via the kind of manual editing one does in a place like Wikia. Your thoughts?
About 7500 articles about proteins can now be found at Freebase. I don't know whether Freebase is becoming a successful project, but I'd rather see the data structured in Freebase (as it can be exported in a semantic web format / JSON) than in a wiki.
Auto-annotating semi-structured data is not particularly hard. It can be done with any 'bot', such as those written with pywikipedia or any of several Perl interfaces.
I maintain SNPedia, a wiki based on MediaWiki but not part of Wikia, which uses a mix of bots and human effort to catalog the effects of human genetic variations. I've used a bot to scoop parts of GeneRIF and other resources into the system. I've used the same techniques for several other non-public wikis as well.
The relevant code is remarkably simple. If anyone has similar needs I would be happy to provide code or tips.
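To give a rough idea of what such a bot does, here is a minimal sketch in Python. It is not the SNPedia code. It assumes the wiki has a template called `Protein` (a hypothetical name), that records arrive as plain dictionaries from whatever source is being scraped, and that bot-written text lives between two HTML comment markers so that human edits elsewhere on the page are left alone. The field names and the `EXAMPLE_ID` value are placeholders, not real data.

```python
from typing import Mapping

# Hypothetical markers delimiting the part of a page the bot is allowed to rewrite.
BOT_START = "<!-- BOT:START -->"
BOT_END = "<!-- BOT:END -->"


def render_template(name: str, fields: Mapping[str, str]) -> str:
    """Render a record as a MediaWiki template call, one parameter per line."""
    lines = ["{{" + name]
    for key, value in fields.items():
        # A raw pipe or newline inside a value would break template parsing,
        # so encode pipes as an HTML entity and flatten newlines to spaces.
        clean = str(value).replace("|", "&#124;").replace("\n", " ").strip()
        lines.append(f"| {key} = {clean}")
    lines.append("}}")
    return "\n".join(lines)


def merge_bot_section(page_text: str, block: str) -> str:
    """Replace the marked bot section with `block`, or append one if absent."""
    wrapped = f"{BOT_START}\n{block}\n{BOT_END}"
    start = page_text.find(BOT_START)
    end = page_text.find(BOT_END)
    if start != -1 and end != -1 and start < end:
        # Keep everything outside the markers exactly as humans left it.
        return page_text[:start] + wrapped + page_text[end + len(BOT_END):]
    separator = "\n\n" if page_text.strip() else ""
    return page_text.rstrip() + separator + wrapped


if __name__ == "__main__":
    # Placeholder record; a real bot would build this from a scraped source.
    record = {"name": "Example protein", "source_id": "EXAMPLE_ID"}
    existing = "Hand-written notes about this protein."
    print(merge_bot_section(existing, render_template("Protein", record)))
```

A bot framework such as pywikipedia would then handle fetching the current page text and saving the merged result. Keeping the rendering and merging as pure functions makes it easy to check them without touching a live wiki.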
The Proteins Wiki
Further reading:
Wikipedia and science