Hadoop MapReduce2 Optimization in a Heterogeneous Cluster


I have the following configuration:

  • Hadoop: v2.7.1 (YARN)
  • Input file: size = 100 GB
  • 3 slaves: each has 4 vcores at 2 GHz and 8 GB of RAM
  • 5 slaves: each has 2 vcores at 1 GHz and 2 GB of RAM
  • MapReduce program: WordCount

How can I minimize the WordCount execution time by assigning small input splits to the 5 slower slaves and big input splits to the 3 fastest slaves?

For each machine, you can set the number of map/reduce slots. If you want to send less workload to the slower machines, you can define, for example, 2 map/reduce task slots on each slower machine and 4 map/reduce task slots on each of the fast machines. This way you can control how much workload each node in the cluster receives.
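As a rough illustration of the arithmetic behind this advice, here is a minimal Python sketch. Under YARN (Hadoop 2.x) there are no fixed slots; how many tasks run at once on a node follows from the resources its NodeManager advertises (`yarn.nodemanager.resource.memory-mb` and `yarn.nodemanager.resource.cpu-vcores` in that node's `yarn-site.xml`) divided by the per-task container size. The sketch takes the 4 and 2 concurrent tasks suggested above, derives matching per-node values, and estimates how the work would be shared. The 512 MB container size, the 128 MB split size (the Hadoop 2.x default block size), and all function names are assumptions for illustration only; real values depend on daemon overhead and scheduler limits, and the sketch ignores the difference in clock speed.

```python
import math

# Assumed values for illustration; they are not from the original post.
CONTAINER_MB = 512        # hypothetical per-task container size
SPLIT_MB = 128            # default HDFS block size in Hadoop 2.x
INPUT_MB = 100 * 1024     # the 100 GB input file

# name -> (number of nodes, concurrent tasks per node), as suggested in the answer
NODE_CLASSES = {"fast": (3, 4), "slow": (5, 2)}


def node_settings(tasks_per_node, container_mb=CONTAINER_MB):
    """Per-node yarn-site.xml values that allow `tasks_per_node` containers at once."""
    return {
        "yarn.nodemanager.resource.memory-mb": tasks_per_node * container_mb,
        "yarn.nodemanager.resource.cpu-vcores": tasks_per_node,
    }


def cluster_capacity(node_classes):
    """Total number of tasks the whole cluster can run at the same time."""
    return sum(count * tasks for count, tasks in node_classes.values())


def workload_share(node_classes):
    """Fraction of concurrently running tasks hosted by each class of node."""
    total = cluster_capacity(node_classes)
    return {
        name: count * tasks / total
        for name, (count, tasks) in node_classes.items()
    }


def map_waves(input_mb, split_mb, capacity):
    """Number of rounds needed to run every map task at the given capacity."""
    splits = math.ceil(input_mb / split_mb)
    return math.ceil(splits / capacity)


if __name__ == "__main__":
    for name, (_, tasks) in NODE_CLASSES.items():
        print(name, node_settings(tasks))
    capacity = cluster_capacity(NODE_CLASSES)
    print("concurrent tasks:", capacity)
    print("share:", workload_share(NODE_CLASSES))
    print("map waves:", map_waves(INPUT_MB, SPLIT_MB, capacity))
```

Note that the splits themselves stay the same size in this sketch: the fast nodes receive more work because they run more splits at once, not because their splits are larger.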

