Hadoop MapReduce2 Optimization in a Heterogeneous Cluster
I have the following configuration:
- Hadoop: v2.7.1 (YARN)
- Input file: 100 GB
- 3 slaves: each has 4 vcores at 2 GHz and 8 GB of RAM
- 5 slaves: each has 2 vcores at 1 GHz and 2 GB of RAM
- MapReduce program: WordCount
How can I minimize the WordCount execution time by assigning small input splits to the 5 slower slaves and big input splits to the 3 fastest slaves?
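For each machine you can determine the number of map/reduce tasks it runs at the same time. If you want to send less workload to the slower machines, you can define, for example, 2 concurrent map/reduce tasks on each slower machine and 4 concurrent map/reduce tasks on each of the fast machines. That way you can control how much of the workload each node in the cluster receives.

Note that Hadoop 2.x on YARN no longer has fixed map/reduce task slots. Tasks run in containers, and the number of containers a node can run at once follows from the resources its NodeManager advertises (yarn.nodemanager.resource.memory-mb and yarn.nodemanager.resource.cpu-vcores in that node's yarn-site.xml) relative to the container size each task requests (for example mapreduce.map.memory.mb). Setting these values lower on the slow slaves than on the fast ones has the same effect as giving them fewer slots.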
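To get a rough feel for how the 4-versus-2 suggestion above distributes the work, here is a small back-of-the-envelope sketch in Python. It is only an estimate under stated assumptions, not a model of the YARN scheduler: it assumes a 128 MB split size (the HDFS default block size in Hadoop 2.x), and it assumes a node's throughput is proportional to its number of concurrent tasks times its clock speed. The names NodeGroup and expected_split_share are made up for this illustration.

```python
from dataclasses import dataclass


@dataclass
class NodeGroup:
    name: str
    nodes: int              # number of machines in this group
    concurrent_tasks: int   # tasks running at once on each machine
    relative_speed: float   # crude proxy for per-task speed (clock in GHz)


def expected_split_share(groups, total_splits):
    """Estimate how many splits each group ends up processing.

    Assumption: splits are handed out as tasks finish, so each group's
    share is proportional to nodes * concurrent_tasks * relative_speed.
    """
    weights = {g.name: g.nodes * g.concurrent_tasks * g.relative_speed
               for g in groups}
    total_weight = sum(weights.values())
    return {name: round(total_splits * w / total_weight)
            for name, w in weights.items()}


# Cluster from the question, with the 4 / 2 concurrent tasks from the answer.
groups = [
    NodeGroup("fast", nodes=3, concurrent_tasks=4, relative_speed=2.0),
    NodeGroup("slow", nodes=5, concurrent_tasks=2, relative_speed=1.0),
]

# 100 GB input with an assumed 128 MB split size gives 800 map tasks.
total_splits = (100 * 1024) // 128

shares = expected_split_share(groups, total_splits)
print(shares)  # {'fast': 565, 'slow': 235}
```

Under these assumptions the 3 fast slaves would process roughly 70% of the 800 splits even though every split has the same size. In other words, the balancing comes from how many tasks each node runs, not from giving different nodes different split sizes.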