Category Archives: Life Science
Supercomputing Masterclass – A request for information
I have been invited to give a Masterworks talk on Data Challenges in Genomics for Supercomputing 09. I would like to dive into the details of the technical and scientific challenges of high-throughput genomics, from microarrays to next-gen sequencing and beyond, and how we need to manage these data more efficiently. [...]
Also posted in BioIT, Computing, Event, Informatics, Omics
TrendingTopics.org: A reference site for data analytics in Hadoop and Hive
In episode 21 of Coast to Coast Bio (not yet released) I talk about Hive. For those who may not know, Hive is a data warehouse infrastructure built on top of Hadoop.
One of the most recent Amazon Public Data Sets is a sample of Wikipedia page statistics by Peter Skomoroch. The full data [...]
Also posted in Big Data, Computing, Informatics
High scale design patterns (missing) in the life sciences
I’ve written about software failures in the past. As I get a better understanding of scale and architectures, and talk to others about some of the core design principles of systems at scale, e.g. Recovery Oriented Computing (also see this talk by James Hamilton), I realize how little most of us in the life science [...]
Also posted in BioIT, Computing
When Whole Genome Sequencing becomes passe
In a recent blog post at MassGenomics about the newly published sequence of a Korean individual, Dan Koboldt makes an interesting observation. He notes:
This week’s publication of the genome of a Korean individual in Genome Research marks the fifth individual whole genome sequenced with massively parallel sequencing platforms. The fact that this [...]
Also posted in Omics, Publishing
Hundred nanoseconds a day