Hadoop MapReduce2 Optimization in Heterogeneous Cluster

June 15, 2010

I have the following configuration:

Hadoop: v2.7.1 (YARN)
Input file: size = 100 GB
3 slaves: each has 4 vcores at 2 GHz and 8 GB of RAM
5 slaves: each has 2 vcores at 1 GHz and 2 GB of RAM
MapReduce program: WordCount

How can I minimize the WordCount execution time by assigning small input splits to the 5 slower slaves and big input splits to the 3 fastest slaves?

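For each machine you can set the number of map/reduce slots. If you want to send less workload to the slower machines, you can define, for example, 2 map/reduce task slots on each slower machine and 4 map/reduce task slots on each of the fast machines. This way you control how much of the workload each node in the cluster receives. Note that under YARN (Hadoop 2.x) fixed slots are replaced by containers, so the equivalent per-node limit is set in each node's yarn-site.xml through yarn.nodemanager.resource.memory-mb and yarn.nodemanager.resource.cpu-vcores.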
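
As a rough illustration of the arithmetic behind this advice, the sketch below estimates how many map tasks each kind of node could run at once and how many "waves" of tasks the job would need. The container size (512 MB, 1 vcore) and the 1024 MB held back for the OS and Hadoop daemons are assumptions chosen for illustration, not values from the question; the 128 MB split size is the default HDFS block size in Hadoop 2.x. NodeClass and containers_per_node are hypothetical helper names, not part of any Hadoop API.

```python
import math
from dataclasses import dataclass


@dataclass
class NodeClass:
    name: str
    count: int    # number of machines of this kind
    vcores: int   # vcores per machine
    ram_mb: int   # physical RAM per machine, in MB


def containers_per_node(node, container_mb=512, container_vcores=1, reserved_mb=1024):
    """Concurrent containers one node can host, limited by memory and by vcores.

    container_mb, container_vcores and reserved_mb are illustrative assumptions.
    """
    usable_mb = max(node.ram_mb - reserved_mb, 0)
    by_memory = usable_mb // container_mb
    by_cpu = node.vcores // container_vcores
    return min(by_memory, by_cpu)


# Hardware taken from the question.
fast = NodeClass("fast", count=3, vcores=4, ram_mb=8 * 1024)
slow = NodeClass("slow", count=5, vcores=2, ram_mb=2 * 1024)

# Total number of map tasks the cluster can run at the same time.
cluster_slots = sum(n.count * containers_per_node(n) for n in (fast, slow))

# 100 GB of input cut into 128 MB splits (one map task per split).
input_mb = 100 * 1024
split_mb = 128
num_splits = math.ceil(input_mb / split_mb)
waves = math.ceil(num_splits / cluster_slots)

for n in (fast, slow):
    per_node = containers_per_node(n)
    print(f"{n.name}: {per_node} concurrent map tasks per node, "
          f"{n.count * per_node} across {n.count} nodes")
print(f"{num_splits} splits, {cluster_slots} concurrent tasks, about {waves} waves")
```

With these assumed values the fast nodes come out at 4 concurrent tasks each and the slow nodes at 2, matching the 4/2 example above. The sketch ignores the clock-speed difference and the container used by the application master. It also reflects that splits stay the same size on every node: a slower node is not handed smaller splits, it simply finishes fewer of them while the faster nodes pick up the rest.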
