Hive job fails with RegionTooBusyException

Symptoms

Error:

Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected exception: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 135 actions: RegionTooBusyException: 135 times, 
	at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:448)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:838)
	at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:838)
	at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:133)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:170)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:555)
	... 9 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 135 actions: RegionTooBusyException: 135 times, 
	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:798)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:838)
	at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:838)
	at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:644)
	at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:676)
	at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754)
	at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:436)
	... 15 more
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 135 actions: RegionTooBusyException: 135 times, 
	at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:234)
	at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$1700(AsyncProcess.java:214)
	at org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:1751)
	at org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:208)
	at org.apache.hadoop.hbase.client.BufferedMutatorImpl.doMutate(BufferedMutatorImpl.java:141)
	at org.apache.hadoop.hbase.client.BufferedMutatorImpl.mutate(BufferedMutatorImpl.java:98)
	at org.apache.hadoop.hbase.client.HTable.put(HTable.java:1011)
	at org.apache.hadoop.hive.hbase.HiveHBaseTableOutputFormat$MyRecordWriter.write(HiveHBaseTableOutputFormat.java:146)
	at org.apache.hadoop.hive.hbase.HiveHBaseTableOutputFormat$MyRecordWriter.write(HiveHBaseTableOutputFormat.java:117)
	at org.apache.hadoop.hive.ql.io.HivePassThroughRecordWriter.write(HivePassThroughRecordWriter.java:40)
	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:764)
	... 22 more

 

Diagnosis

This error usually occurs for one of two reasons: the write could not acquire the region lock in time, or the region's memstore has grown past its blocking limit because flushes cannot keep up with the write load.
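
To make the memstore limit concrete, here is a minimal sketch of the arithmetic. It assumes the blocking threshold is hbase.hregion.memstore.flush.size multiplied by hbase.hregion.memstore.block.multiplier, and that the flush size is 128 MB; check both against your own cluster. The class and method names are hypothetical and are not part of HBase.

```java
public class MemstoreBlockingLimit {

    // Bytes at which a region starts rejecting writes, assuming the limit is
    // flush size * block multiplier.
    static long blockingBytes(long flushSizeBytes, int blockMultiplier) {
        return Math.multiplyExact(flushSizeBytes, (long) blockMultiplier);
    }

    public static void main(String[] args) {
        long flushSize = 128L * 1024 * 1024; // assumed flush size: 128 MB
        for (int multiplier : new int[] {4, 8}) {
            long limitMb = blockingBytes(flushSize, multiplier) / (1024 * 1024);
            System.out.println("multiplier=" + multiplier
                    + " -> writes blocked above " + limitMb + " MB per region");
        }
    }
}
```

Under these assumptions, raising the multiplier from 4 to 8 doubles the amount of unflushed data a region accepts before it rejects writes, at the cost of more heap used on the region server.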

 

Solution

Increase the values of the following HBase properties on the cluster:

hbase.ipc.client.call.purge.timeout (default: 120000 ms)
hbase.rpc.timeout (must be set to the same value as hbase.ipc.client.call.purge.timeout)
hbase.hregion.memstore.block.multiplier (default: 4, maximum: 8)
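
As a rough illustration, the properties listed above could be set in hbase-site.xml as in the fragment below. The values shown are assumptions chosen only for the example (both timeouts doubled from the 120000 default given above, and the multiplier raised to its stated maximum); tune them for your own workload.

```xml
<!-- hbase-site.xml fragment: example values only, not recommendations -->
<property>
  <name>hbase.ipc.client.call.purge.timeout</name>
  <value>240000</value>
</property>
<property>
  <!-- kept equal to hbase.ipc.client.call.purge.timeout -->
  <name>hbase.rpc.timeout</name>
  <value>240000</value>
</property>
<property>
  <name>hbase.hregion.memstore.block.multiplier</name>
  <value>8</value>
</property>
```

Configuration changes of this kind typically take effect only after the affected HBase services are restarted and the Hive job is rerun with the updated client configuration.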

Version history: revision 5 of 5, last updated 11-09-2017 05:51 AM