Map Reduce on Heterogeneous Multi-Core clusters
Wednesday, April 8, 2009, 10:00am - Wednesday, April 8, 2009, 11:30am
We have extended the Map Reduce programming paradigm to clusters with multicore accelerators. Map Reduce is a simple programming programming model designed for parallel computations with large distributed datasets. Google has reinforced the practical effectiveness of this approach with over 1000 commercial Map Reduce applications. Typical Map Reduce implementations, such as Apache Hadoop exploit parallel file systems for use in homogeneous clusters. Unfortunately, the multicore accelerators such as Cell B.E. used in modern supercomputers such as Roadrunner require additional layers of parallelism, which cannot be addressed from parallel file systems alone. Related work has explored Map Reduce on a single Cell B.E. accelerator machine using hash and sort based techniques. We are incorporating techniques from Apache Hadoop as well as early multicore Map Reduce research to produce an implementation optimized for a hybrid multicore cluster. We are evaluating our implementation on a cluster of 24 of Cell Q series nodes, and and 48 multicore PowerPC J series nodes at the Multi-core center at University of Maryland Baltimore County.