This error occurs when too many fetch failures accumulate on a specific reducer task node. Three properties are worth checking for this issue:
- mapred.reduce.slowstart.completed.maps = 0.80
delays starting reducers until 80% of the maps have completed, which lets reducers from other jobs run while a big job is still waiting on its mappers
- tasktracker.http.threads = 80
specifies the number of HTTP threads the TaskTracker uses to serve map output to reducers
- mapred.reduce.parallel.copies = sqrt(#of nodes), with a floor of 10
the number of parallel copies a reducer uses to fetch map output
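The three properties above could be set together in mapred-site.xml. A minimal sketch, using the suggested values; the parallel-copies value of 10 assumes a hypothetical cluster of up to about 100 nodes (sqrt(100) = 10, which is also the floor):

```xml
<!-- mapred-site.xml (MRv1 property names) -->
<configuration>
  <property>
    <name>mapred.reduce.slowstart.completed.maps</name>
    <!-- reducers start only after 80% of maps have completed -->
    <value>0.80</value>
  </property>
  <property>
    <name>tasktracker.http.threads</name>
    <!-- HTTP threads serving map output to reducers -->
    <value>80</value>
  </property>
  <property>
    <name>mapred.reduce.parallel.copies</name>
    <!-- sqrt(#of nodes), floor of 10 -->
    <value>10</value>
  </property>
</configuration>
```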