Map Reduce on Heterogeneous Multi-Core Clusters
Wednesday, April 8, 2009, 10:00am - 11:30am
ITE 325
We have extended the Map Reduce programming paradigm to clusters with
multicore accelerators. Map Reduce is a simple programming
model designed for parallel computations with large distributed datasets.
Google has reinforced the practical effectiveness of this approach with
over 1000 commercial Map Reduce applications. Typical Map Reduce
implementations, such as Apache Hadoop, exploit parallel file systems for
use in homogeneous clusters. Unfortunately, multicore accelerators such as
the Cell B.E., used in modern supercomputers such as Roadrunner, require
additional layers of parallelism that parallel file systems alone cannot
address. Related work has explored Map Reduce on a single Cell B.E.
accelerator machine using hash- and sort-based techniques. We are
incorporating techniques from Apache Hadoop as well as early multicore Map
Reduce research to produce an implementation optimized for a hybrid
multicore cluster. We are evaluating our implementation on a cluster of
24 Cell Q series nodes and 48 multicore PowerPC J series nodes at
the Multi-core center at the University of Maryland, Baltimore County.
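
For readers unfamiliar with the programming model, the following is a minimal single-process sketch of the map, shuffle, and reduce phases, using word counting as the example. It is not the implementation described in this talk and does not model Cell B.E. accelerators, Hadoop, or a parallel file system; the function names (map_phase, shuffle, reduce_phase, map_reduce) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit one (word, 1) pair for every word in a text record."""
    for word in record.split():
        yield word.lower(), 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine the values for one key into a single result."""
    return key, sum(values)


def map_reduce(records):
    """Run the three phases sequentially over an iterable of records.

    In a real cluster the map and reduce calls would run in parallel on
    different nodes; here they run in one process for clarity.
    """
    pairs = (pair for record in records for pair in map_phase(record))
    groups = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in groups.items())


if __name__ == "__main__":
    print(map_reduce(["map reduce on cell", "map reduce on hadoop"]))
```

Because each map call depends only on its own record and each reduce call only on the values for its own key, both phases can in principle be distributed across nodes, which is the property a cluster implementation relies on.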