Programming with Hadoop - A Hands-On Introduction

by

Tuesday, September 20, 2011, 10:30am - Tuesday, September 20, 2011, 12:00pm

ITE 325b

In this week's meeting we will dive right into writing MapReduce programs, skipping the gory details of Hadoop setup and MapReduce theory. In one hour, we will use Eclipse to write a MapReduce program in Java that builds an inverted index, test it on a local machine, and run it on a Hadoop cluster that has already been set up. If we have time, we will also see how to do the same in Python instead of Java.
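For anyone who wants a feel for the inverted-index idea ahead of time, here is a minimal sketch in plain Python. It is not the program we will write at the meeting and it does not use Hadoop at all: it only imitates the map, shuffle and reduce steps in memory. The function names (`map_phase`, `shuffle`, `reduce_phase`, `build_inverted_index`) and the sample documents are made up for illustration.

```python
import re
from collections import defaultdict


def map_phase(doc_id, text):
    """Map step: emit a (word, doc_id) pair for every word in one document."""
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        yield word, doc_id


def shuffle(pairs):
    """Group values by key, roughly what the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, doc_ids):
    """Reduce step: collapse the ids seen for one word into a sorted, de-duplicated list."""
    return word, sorted(set(doc_ids))


def build_inverted_index(documents):
    """Run the three steps over a dict of {doc_id: text} (hypothetical input format)."""
    pairs = (
        pair
        for doc_id, text in documents.items()
        for pair in map_phase(doc_id, text)
    )
    return dict(reduce_phase(word, ids) for word, ids in shuffle(pairs).items())


if __name__ == "__main__":
    # Made-up sample documents, for illustration only.
    docs = {
        "doc1": "Hadoop runs MapReduce jobs",
        "doc2": "MapReduce jobs build an index",
    }
    for word, ids in sorted(build_inverted_index(docs).items()):
        print(word, ids)
```

On a real cluster the grouping step is handled by the framework, and the map and reduce steps run as separate tasks over files rather than over an in-memory dictionary.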

You are encouraged to do the following before the meeting if you want to code along.

  • Review the Yahoo Introduction to MapReduce tutorial
  • Download a free virtual machine image with Hadoop pre-installed, so you can get started quickly. Options are available for Linux, Windows and Mac OS X.
  • Make sure you have JDK 1.6.x and Eclipse (or your favourite IDE) installed on your laptop.
Addenda (9/19):
  • If you are planning to code along during the demo, download the latest stable release of Hadoop (0.20.2).
  • Some people have been having problems with Cloudera's 64-bit VM image. If you do, try this virtual machine from Yahoo Developer Network, which contains a pre-installed Hadoop 0.20.
  • Even if you are not able to get the VM running for now, you can still run the program(s) locally on your laptop using Eclipse.

Tim Finin

UMBC ebiquity