Some months ago, I wrote about Wikipedia and science. In that post, one of the questions was whether Wikipedia is an appropriate resource for scientific expertise, rather than just a general resource.
Naturally, having had that line of thought before, and being a supporter of initiatives like the Encyclopedia of Life (still waiting for it to come online), I was quite intrigued when someone pointed out the Genes Wiki effort on Wikipedia. Like the Proteins Wiki, hosted on Wikia, this is an effort to catalog information about biological entities. According to Andrew Su, who is spearheading the Genes Wiki effort:
The goal is to create gene stubs for every gene in the human genome. The stubs will have some minimal amount of structured content – links to important databases, GO annotations, PDB structures, etc. It’s our hope/expectation that these stubs will then seed contributions from experts in the field, specifically the “free-text” and unstructured sort of knowledge for which there really isn’t a great resource available.
As I interpret it, the goal is to create a knowledgebase around human genes. My first reaction as I looked at this was that this would be a great Freebase project. Since Freebase allows you to add structure where there is none, combining a Wikipedia resource with the structure/capabilities of Freebase would be quite powerful. The powerful query capabilities that Freebase provides are what make this especially intriguing in my eyes.
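To make the idea of "adding structure" a little more concrete, here is a minimal Python sketch of what a structured gene stub and a simple query over a set of stubs might look like. This is not the Gene Wiki's or Freebase's actual data model: the `GeneStub` class, its field names, the `genes_with_annotation` helper, and every identifier in the sample records are hypothetical placeholders, loosely based on the kinds of content mentioned in the quote above (database links, GO annotations, PDB structures, free text).

```python
from dataclasses import dataclass, field


@dataclass
class GeneStub:
    """Hypothetical structured record for one gene page (not a real schema)."""
    symbol: str
    database_links: dict[str, str] = field(default_factory=dict)  # database name -> identifier
    go_annotations: list[str] = field(default_factory=list)       # GO term identifiers
    pdb_structures: list[str] = field(default_factory=list)       # PDB structure identifiers
    free_text: str = ""                                           # unstructured expert notes


def genes_with_annotation(stubs, go_term):
    """Return the symbols of all stubs annotated with the given GO term."""
    return [stub.symbol for stub in stubs if go_term in stub.go_annotations]


# Placeholder records: symbols and identifiers are made up for illustration only.
stubs = [
    GeneStub("GENE_A", {"ExampleDB": "0001"}, ["GO:EXAMPLE1"], ["XXXX"]),
    GeneStub("GENE_B", {"ExampleDB": "0002"}, ["GO:EXAMPLE1", "GO:EXAMPLE2"]),
    GeneStub("GENE_C", {"ExampleDB": "0003"}, ["GO:EXAMPLE2"]),
]

print(genes_with_annotation(stubs, "GO:EXAMPLE2"))
```

The point of the sketch is only that once the stub fields are structured, questions like "which genes share this annotation?" become simple lookups, while the free-text field stays open for expert contributions.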
The second question I have is whether Wikipedia is the right place for such a project. I am still torn. To some extent, I would prefer to see a master page in Wikipedia, say a page for “gene”, with a link to a dedicated wiki or similar resource for a “Gene Catalog”, perhaps even one on Wikia. Of course, this project is now well under way, so it might be difficult to do that (currently there are close to 7000 stubs). Here’s an example, the oh-so-well-known Sonic Hedgehog. When I look at that page, it almost screams for a dedicated site, with embedded, interactive protein viewers, chromosome viewers, etc., a true mashup.
What the project does point to is the power of the web today. With the resources and tools available to us, it is possible for one or more people to set up such an effort, which in the past would have been next to impossible without a major grant of some sort. Of course, the Wiki structure is ideally suited to getting contributions from the wider community. Perhaps I should drop by the Rhodopsin page.



6 Comments
I believe this could be done just as easily using the semantic extensions to MediaWiki, aka Semantic MediaWiki. This also provides better support for established formats such as RDF and OWL (although the OWL was a bit flaky the last time I tested it).
A gene catalog implementing Ontogene etc. in a Semantic MediaWiki could be a powerful resource.
Which reminds me. I really need to take another look at the Semantic extensions. Thanks
There were two primary considerations that went into choosing WP instead of Freebase (I think they apply equally to Semantic MediaWiki, which I wasn’t aware of). First, obviously there is a huge community at WP, and critical mass is necessary for any community knowledge project to get off the ground. (Case in point, the other one-off gene wikis that apparently are stagnating.)
Second, whereas the primary emphasis of RDF seems to be getting structured data out, our primary concern was getting data in. WP makes that process about as easy as it can be. Remember we’re dealing with experimental biologists here, so one can’t overestimate the sophistication of the average potential contributor. The onus for metadata annotation should not be on these contributors. The simpler we make the process of contributing, the more of “the long tail” we’ll be able to harness.
(Of course I see the value of getting structured data out, but I think there’s room all along the spectrum for different types of systems. The Gene Wiki at WP is at one extreme…)
This is based on my current (limited) understanding of open data efforts… I’m open to being educated.
(For those who are interested, the list of all gene pages created/amended in our effort: http://tinyurl.com/23c2rl)
4 Trackbacks
[...] for Radiology, links to a post about a “Gene Wiki” project, from which I re-find the excellent blog by Deepak Singh. From there I find this interesting resource: FreeBase, which is different that Wikipedia (it [...]
[...] will work for this kind of science. I’m not sure Wikipedia is the venue for this (a similar concern voiced by BBGM) or will be able to bring the level of completeness and accuracy required by scientific research. [...]
[...] Overflow podcast GDGT Podcast FLOSS Weekly Dan Ingalls Net at Nite E. Coli Hub Topsan Java Posse The Genes Wiki project Matt [...]
[...] on using wikis for annotation, specifically around the Genes Wiki project. In the past I have been somewhat skeptical about using Wikipedia as the host for such an effort. I am not so sure about that skepticism anymore, especially if one [...]