Semantic thoughts
February 3, 2008
A month or so ago, I had a fascinating conversation with someone about the Semantic Web. I am no Semantic Web guru. Far from it, although I hope to be 500% more educated on formal Semantic Web concepts in the not so near future. Nor am I a Semantic Web zealot, someone whose world fits into a Triple Store. I fully understand that there are many ways of allowing machines to talk to each other, share information, understand context and so on. The challenge lies in building a layer on top of the underlying plumbing, whether that plumbing is the more formal Semantic Web concepts or microformats and similar machine-communication methods (the semantic web with lowercase “s” and “w”, as some call it). In the recent past, I have talked about entity extraction and simple methods to get machines to speak to each other and find related metadata. However, there are applications where I think the Semantic Web, yes, the fully featured Semantic Web, carries a lot of value, and one of them happens to be the life sciences, where everything is interrelated.
One reason why I believe the Semantic Web is critical to the future of science is probably best characterized by Nova Spivack in a blog post on understanding the Semantic Web, where he counters a post by Tim O’Reilly on Web 2.0 vs. Web 3.0 (such labels often only serve to muddy the waters, although they can be very convenient legs to stand on). One of the key points that Nova makes is the following, in a section appropriately entitled “The real point of the Semantic Web = Open Data” (emphasis mine):
From what I can see, Tim thinks the Semantic Web is some kind of artificial intelligence system. If that is the case, he’s completely missing the point. Yes, of course it enables better, smarter applications. But it’s fundamentally NOT about AI and it never was. It’s about OPEN DATA. The Semantic Web should be renamed to simply The Data Web.
He goes on to say:
The benefit of Open Data is that it enables databases and the data they contain to be designed, shared, and mashed-up in a totally bottom-up, user-driven, Web 2.0 manner. This is in fact collective-intelligence applied to data.
Now, I am not smart enough to state whether RDF, OWL, SPARQL, etc. are the definitive ways of making this open data web most effective, although they are definitely up there in usefulness. The point is that Open Data is effective not only because we want transparency in our scientific data, but also because it allows us to access and mine information in new ways that were hitherto not as convenient. Recently, I have been listening to Talks with Talis a lot, and a common theme seems to be how we can use Semantic Web principles to find relationships in data. In my mind this is way more important than the social web, which has its own place; in the end, while people are important, finding useful information and mining knowledge are what we (at least some of us) are interested in. It’s what we should strive to get to, whether it is via RDF and SPARQL, or some other methods that allow us to build intelligence into our systems based on the underlying relationships between data.
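To make “finding relationships in data” a little more concrete, here is a minimal sketch in plain Python. It is not RDF or SPARQL, and it does not use any Semantic Web library; it only illustrates the underlying idea of storing facts as subject-predicate-object triples and following links between them. The protein, gene and disease names, the predicate names, and the helper functions `match` and `two_hop` are all made up for illustration.

```python
from typing import Iterable, Iterator, Optional

Triple = tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical toy data: every name below is an illustrative assumption,
# not a real biological fact or a real vocabulary.
TRIPLES: set[Triple] = {
    ("proteinA", "interacts_with", "proteinB"),
    ("proteinB", "interacts_with", "proteinC"),
    ("proteinA", "encoded_by", "geneA"),
    ("proteinC", "associated_with", "diseaseX"),
}


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that agrees with the given pattern.

    A value of None acts as a wildcard for that position.
    """
    for subj, pred, obj in triples:
        if s is not None and subj != s:
            continue
        if p is not None and pred != p:
            continue
        if o is not None and obj != o:
            continue
        yield (subj, pred, obj)


def two_hop(triples: Iterable[Triple], start: str, p1: str, p2: str) -> set[str]:
    """Follow predicate p1 from `start`, then predicate p2 from each result."""
    triples = list(triples)  # allow iterating more than once
    found: set[str] = set()
    for _, _, middle in match(triples, s=start, p=p1):
        for _, _, end in match(triples, s=middle, p=p2):
            found.add(end)
    return found


if __name__ == "__main__":
    # Which things is proteinA linked to through one intermediate interaction
    # that is itself associated with something?
    print(two_hop(TRIPLES, "proteinA", "interacts_with", "interacts_with"))
```

Because every fact has the same three-part shape, data from different sources can be merged simply by pooling the triples, and a relationship that spans two sources becomes a matter of chaining two pattern matches. Real RDF stores and SPARQL add shared identifiers, vocabularies and a proper query language on top of this basic picture.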
Which brings me to a thought about Freebase, which I have started penning down on the Wiki. I have always wondered what the best application of Freebase would be. The obvious thought was to try to build protein networks by adding structure to existing data, but the more I think about it, perhaps the best application is to use it as a data backend and then build apps on top of it. From Scifoo, I believe Danny Hillis considers such applications to be an ideal use of Freebase as well.
Technorati Tags: Semantic Web, Open Data
Comments
4 Responses to “Semantic thoughts”
Yes… DEFINITELY about Open Data…
…and glad you like the podcasts…