
When more is easier

More goodness from Jeff Jonas. In The Fast Last Puzzle Piece, he argues that the notion that more data means a slower system is not true. His analogy is a jigsaw puzzle, which starts easy, gets harder, and eventually gets easier again as the remaining pieces can only fit in a few specific positions (fewer degrees of freedom, in the language we are used to). He goes on to add that such behavior depends on a set of requirements being met, and that’s what caught my eye. Essentially, any set of observations must:

  • Belong to the same universe
  • Have enough features to enable contextualization
  • Be such that the features can be extracted, enhanced and classified
  • Sufficiently saturate observational space

He adds that you need to have enough smarts to stitch everything together.

As I read that list, I kept thinking of the data we are used to seeing as life scientists. One would think it satisfies all the criteria above, so why are things getting harder? I think it has to do with the point about saturation. In many cases we don’t have saturation, which is why we can’t get the results we need. In others (structure prediction comes to mind), we do have sufficient saturation, and we are able to get meaningful results as our body of work grows. For a lot of data types, however, we haven’t yet hit the tipping point at which the system gets “faster” and easier to solve.

What do you think?


This entry was posted in Informatics, Infotech, Life Science.