
Search, Bayesian filters and the IF statement

On episode 28 of the Stack Overflow podcast, Joel Spolsky talked about something Adam Bosworth mentioned to him after Bosworth moved from Microsoft to Google. Apparently Bosworth said something along the lines of “Google uses Bayesian filters like Microsoft uses the IF statement”.

The conversation went on to cover how that mentality lets Google always work from the assumption that a search returns a million things, but these ten are the ones the person is most likely looking for. That got me thinking about life science search. When we are looking for a paper, or some scientific topic, what are we looking for? What do we expect back?

The challenge here is context. The same query in different contexts might, and probably should, return different results. When we know nothing about the context, we really need to think about presentation and let people drill into their own context. Can we guess some of the more common contexts?
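To make "guessing the context" a bit more concrete, here is a minimal sketch of the Bayesian way of thinking about it: treat each context as a class, and estimate P(context | query terms) with a tiny naive Bayes model. Everything in it is an assumption for illustration. The context labels, the training terms, and the function names `train` and `context_posteriors` are made up, and this is not how Google or NextBio actually do it.

```python
import math
from collections import Counter

# Hypothetical training data: (context label, terms seen with that context).
# Labels and terms are invented purely for illustration.
TRAINING = [
    ("vision", ["retinyl", "retina", "rhodopsin"]),
    ("vision", ["retinyl", "photoreceptor"]),
    ("metabolism", ["retinyl", "ester", "liver"]),
    ("metabolism", ["lipid", "liver", "storage"]),
]


def train(examples):
    """Count how often each context occurs and which terms occur in it."""
    priors = Counter()
    term_counts = {}
    vocab = set()
    for label, terms in examples:
        priors[label] += 1
        term_counts.setdefault(label, Counter()).update(terms)
        vocab.update(terms)
    return priors, term_counts, vocab


def context_posteriors(query_terms, priors, term_counts, vocab):
    """Return P(context | query terms) under a naive Bayes assumption."""
    total = sum(priors.values())
    log_scores = {}
    for label, n in priors.items():
        counts = term_counts[label]
        # Add-one (Laplace) smoothing so unseen terms do not zero out a context.
        denom = sum(counts.values()) + len(vocab)
        score = math.log(n / total)
        for term in query_terms:
            score += math.log((counts[term] + 1) / denom)
        log_scores[label] = score
    # Normalize in a numerically stable way so the posteriors sum to 1.
    top = max(log_scores.values())
    weights = {k: math.exp(v - top) for k, v in log_scores.items()}
    z = sum(weights.values())
    return {k: w / z for k, w in weights.items()}


if __name__ == "__main__":
    priors, term_counts, vocab = train(TRAINING)
    for query in (["retinyl", "retina"], ["retinyl", "liver"]):
        print(query, context_posteriors(query, priors, term_counts, vocab))
```

The point is that the same term, “retinyl”, gets pulled toward a different context depending on what else is in the query. A search engine could use scores like these to order results, or to decide which drill-down options to show first.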

Not sure anyone gets it quite right, but I did dig what NextBio is doing. I did a search for my advisor. He does research on various topics, of course, but the search engine does one cool thing: it puts a tag cloud at the top, which lets you drill down into various subjects. For example, I clicked on “retinyl”

[Image: nextbio_tagcloud]

and I got the result I was expecting.

[Image: nextbio_filter]

Moral of the story: simple UI steps can go a long way, but we really need to start doing the kind of inference that Google does. We don’t have as much data to work with, but there’s enough floating around. Thoughts?

This entry was posted in Informatics, Search.

2 Comments

  1. Posted November 12, 2008 at 13:56 | Permalink

    I wish I had been made to learn concepts such as Bayesian statistics and maximum likelihood.

  2. Posted November 12, 2008 at 18:56 | Permalink

    I wish I had been made to learn concepts such as Bayesian statistics and maximum likelihood.

  • Disclaimer

    All opinions on this blog are my own and do not reflect those of my employers, past or present