Context and puzzles revisited
Some time ago I wrote about how more data does not necessarily mean you get slower, channeling Jeff Jonas’ analogy to jigsaw puzzles. He extends some of those thoughts in a recent post on how observations are accumulated into context. As usual, I find a lot of parallels with the practice of data-driven science, specifically bioinformatics.
We have an increasing number of fragments of data and, some day, perhaps enough to actually build a complete puzzle. In this latest post, Jeff talks about some of the decisions we need to make. Should we favor the false negative by putting tight constraints on when we snap two pieces together? Or should we permit a degree of uncertainty and potentially find connections we might otherwise miss, essentially raising the false positive rate? Jeff argues in favor of the stricter approach, but even then you often reach a point which suggests that an earlier decision might have been wrong. In the life science world, that happens all the time (and then some). The key point in his argument, and the part I discussed before, is that when our context improves, i.e. when we have more information, the computational challenge decreases, because for all practical purposes you have fewer degrees of freedom.
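To make the trade-off between the two kinds of error concrete, here is a minimal sketch in Python. Everything in it is my own assumption for illustration: the `Candidate` type, the similarity scores, the thresholds and the "true match" labels are hand-made toy values, not anything taken from Jeff's post or from real data.

```python
from typing import List, NamedTuple, Tuple


class Candidate(NamedTuple):
    """A hypothetical pair of puzzle pieces (or data fragments) we might join."""
    score: float         # assumed similarity between the two pieces, 0..1
    is_true_match: bool  # ground truth, known only because this is a toy example


# Hand-made toy candidates; purely illustrative.
CANDIDATES: List[Candidate] = [
    Candidate(0.95, True),
    Candidate(0.90, True),
    Candidate(0.80, False),
    Candidate(0.75, True),
    Candidate(0.60, False),
    Candidate(0.55, True),
    Candidate(0.30, False),
]


def count_errors(candidates: List[Candidate], threshold: float) -> Tuple[int, int]:
    """Join every pair scoring at or above the threshold.

    Returns (false_positives, false_negatives):
    - false positive: a pair we joined that does not belong together
    - false negative: a pair we left apart that does belong together
    """
    false_positives = sum(
        1 for c in candidates if c.score >= threshold and not c.is_true_match
    )
    false_negatives = sum(
        1 for c in candidates if c.score < threshold and c.is_true_match
    )
    return false_positives, false_negatives


if __name__ == "__main__":
    for label, threshold in [("tight", 0.85), ("loose", 0.50)]:
        fp, fn = count_errors(CANDIDATES, threshold)
        print(f"{label} threshold {threshold}: {fp} false positives, {fn} false negatives")
```

With these made-up numbers the tight threshold joins nothing wrongly but misses two real connections, while the loose threshold finds every real connection at the cost of two wrong joins. Where to set the threshold is the decision described above; the sketch only shows that moving it trades one kind of error for the other.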
Jeff also uses a new phrase, Persistent Context, to describe a state where the net sum of all previous observations and assertions co-exist. This perhaps sums it up best, especially in a field like biology. The question I find myself asking is: when do we reach a stage where we can take the inherent scientific relationships in a biological puzzle, the intelligence as it were, and let that guide the completion of the puzzle? When do we have enough predictive power?