Recommendation: Data-intensive text processing with MapReduce

Staying on my massive data processing theme, here is a more practical post. In the world of large-scale distributed processing, the original MapReduce paper probably holds the most important position. Hadoop remains the best known of the MapReduce implementations and is now a proven, battle-tested commodity. Tom White’s book is a great place to start if you are interested in the framework itself, but the book I want to point out is Jimmy Lin’s Data-Intensive Text Processing with MapReduce (a pre-production PDF is available from the book’s homepage), which is a great dive into algorithm design. It covers general algorithm design, indexing, and graphs, and has a fabulous section on expectation maximization that is a must-read for bioinformaticians interested in analyzing and processing large data sets.

This entry was posted in Big Data, Computing, Programming.

2 Trackbacks

  1. [...] Deepak Singh: Recommendations: Data-Intensive Text Processing With Mapreduce – “Tom White’s book is a great place to start if you have an interest in the framework itself, but the book I wanted to point out was Jimmy Lin’s book on Data-Intensive Text Processing with MapReduce (there is a pre-production PDF of the book from the homepage) and it’s a great dive into algorithm design.” [...]

  2. By A library for information retrieval on August 21, 2010 at 20:27

    [...] Data-intensive text processing with MapReduce [...]

  • Disclaimer

    All opinions on this blog are my own and do not reflect those of my employers, past or present.