Calais, Reuters and the changing value of data

As mentioned a few times in the past, I really enjoy listening to the interviews that Paul Miller and Danny Ayers conduct for Talis. One I heard recently was with Barak Pridor, CEO of ClearForest (whose Gnosis Firefox plugin has been covered here in the past). In the interview Barak talks about Calais, a web service that automatically attaches rich semantic metadata to submitted content. I am planning to try it out (it returns RDF), starting with content from bbgm. The hope is to create a graph connecting the people and organizations mentioned on bbgm. Anyway, in the interview Barak says something that resonated quite a bit (not too surprisingly). His words were along the lines of:

Value is shifting from raw data/content to analysis and tools built on top of the underlying content

This is one of the central tenets of the bbgm philosophy. It’s good to hear an organization like Reuters (which acquired ClearForest last year) adhere to that philosophy.
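To make the people-and-organizations graph mentioned above a little more concrete, here is a minimal sketch of the idea. It does not call Calais or parse RDF; the `MENTIONS` list is a hand-written, hypothetical stand-in for whatever an entity extraction service returns, and the post ids, the tuple shape and the `build_graph` helper are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product

# Hypothetical, hand-written stand-in for extracted entities: one
# (post_id, entity_type, name) tuple per mention. A real service's
# response would first have to be parsed into this shape.
MENTIONS = [
    ("post-1", "Person", "Barak Pridor"),
    ("post-1", "Organization", "ClearForest"),
    ("post-1", "Organization", "Reuters"),
    ("post-2", "Person", "Paul Miller"),
    ("post-2", "Organization", "Talis"),
    ("post-3", "Person", "Barak Pridor"),
    ("post-3", "Organization", "Reuters"),
]


def build_graph(mentions):
    """Link each person to every organization named in the same post.

    Returns a dict mapping (person, organization) to the number of
    posts in which the two appear together.
    """
    people = defaultdict(set)
    orgs = defaultdict(set)
    for post_id, entity_type, name in mentions:
        if entity_type == "Person":
            people[post_id].add(name)
        elif entity_type == "Organization":
            orgs[post_id].add(name)

    edges = defaultdict(int)
    for post_id, names in people.items():
        # Pair every person with every organization from the same post.
        for person, org in product(names, orgs.get(post_id, ())):
            edges[(person, org)] += 1
    return dict(edges)


if __name__ == "__main__":
    for (person, org), weight in sorted(build_graph(MENTIONS).items()):
        print(f"{person} -- {org} (posts: {weight})")
```

Co-occurrence within a post is a crude notion of a link, but it is enough to get a first picture; edge weights make it easy to filter out pairs that only show up together once.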

This entry was posted in Semantic Web, Software & Internet.

4 Comments

  1. Posted March 15, 2008 at 02:46 | Permalink

    Hmm, now this looks interesting, and possibly closer to what we want than Twine for our purposes. Our big issue at the moment is how to take our ‘semi-semantic’ (we desperately need a better term than that) material from the LaBLog and actually capture a snapshot of the embedded information in a more formal framework. This looks like it might do the job, but I will have to look closer.

  2. Posted March 15, 2008 at 13:11 | Permalink

    Twine is not (at least not yet) a developer platform, and is more consumer focussed. Calais, on the other hand, is a text analytics platform for entity extraction. I am not sure it works for the kind of information you want to pull out, but you should check it out.

    Of course, you could always try Freebase as well.

  3. Posted March 16, 2008 at 05:28 | Permalink

    Yes, having had a look at the Calais site, it seems to be a beefed-up version of the ClearForest web service. What seems to be missing to me is the ability to extract partially structured data from web sites: not recognising people or organisations, but recognising things that have been structured as key and value. We have a lot of structured information in the lab blog, but extracting it is not straightforward, and then presenting it is a whole other problem.

  4. Posted March 16, 2008 at 08:22 | Permalink

    Decided to leave a video comment about Freebase :)

    [viddler_video=49c7b76b]

  • Disclaimer

    All opinions on this blog are my own and do not reflect those of my employers, past or present