# Fault Description

Hive throws the error below when executing this HQL statement (the query runs fine when the data volume is small):

```sql
hive> SELECT substring(request_body["uuid"], -1, 1) AS uuid,
             count(DISTINCT request_body["uuid"]) AS count
      FROM log_bftv_api
      WHERE year = 2017 AND month = 11 AND day = 1
        AND request_body["method"] = "bv.lau.urecommend"
        AND length(request_body["uuid"]) = 25
      GROUP BY 1
      ORDER BY uuid;
```
# Error Message

```
MapReduce Total cumulative CPU time: 1 minutes 46 seconds 70 msec
Ended Job = job_1510050683827_0137 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1510050683827_0137_m_000002 (and more) from job job_1510050683827_0137

Task with the most failures(4):
-----
Task ID:
  task_1510050683827_0137_m_000000

URL:
  http://namenode:8088/taskdetails.jsp?jobid=job_1510050683827_0137&tipid=task_1510050683827_0137_m_000000
-----
Diagnostic Messages for this Task:
Error: Java heap space

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 3  Reduce: 5  Cumulative CPU: 106.07 sec  HDFS Read: 223719539  HDFS Write: 0  FAIL
Total MapReduce CPU Time Spent: 1 minutes 46 seconds 70 msec
```
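The one-line diagnostic rarely includes the full stack trace. Assuming YARN log aggregation is enabled, the complete container logs can be pulled with the YARN CLI; the application ID is the job ID with the `job_` prefix swapped for `application_`:

```shell
# Fetch aggregated container logs for the failed job and locate the heap-space stack trace.
yarn logs -applicationId application_1510050683827_0137 | grep -B 2 -A 20 "OutOfMemoryError"
```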
# Cause Analysis

The output shows `Error: Java heap space` and `return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask`. According to what I found, this points to insufficient memory: the HQL is actually compiled into MapReduce Java tasks, and a map task ran out of heap, so I made the following changes.
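Before editing the cluster-wide files below, the same memory knobs can be raised for a single session from the Hive CLI and the query rerun; a minimal sketch with illustrative, untuned values:

```shell
# Session-scoped overrides; they affect only the current Hive session, no restart needed.
hive> SET mapreduce.map.memory.mb=2048;        -- container size per map task (MB)
hive> SET mapreduce.map.java.opts=-Xmx1638m;   -- JVM heap inside the map container (~80% of the container)
hive> SET mapreduce.reduce.memory.mb=4096;
hive> SET mapreduce.reduce.java.opts=-Xmx3276m;
```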
# Solution

```shell
hadoop shell > vim etc/hadoop/hadoop-env.sh
# default is 1000
export HADOOP_HEAPSIZE=4096

hadoop shell > vim etc/hadoop/yarn-env.sh
# default is 1000; adjust to fit your environment
YARN_HEAPSIZE=4096

hadoop shell > vim etc/hadoop/mapred-site.xml
```

```xml
<!-- Newly added parameters; scale them up in multiples to match your hardware. -->
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>1536</value>
</property>
<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx1024M</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>3072</value>
</property>
<property>
  <name>mapreduce.reduce.java.opts</name>
  <value>-Xmx2560M</value>
</property>
<property>
  <name>mapreduce.task.io.sort.mb</name>
  <value>512</value>
</property>
<property>
  <name>mapreduce.task.io.sort.factor</name>
  <value>100</value>
</property>
<property>
  <name>mapreduce.reduce.shuffle.parallelcopies</name>
  <value>50</value>
</property>
```

My test environment is four 8-core / 8 GB KVM virtual machines: one NameNode and three DataNodes. After this parameter adjustment, a 600 GB dataset has run without problems, and historical and new data keep being written to HDFS.
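To roll the change out, the edited files have to reach every node; the env-script changes (`HADOOP_HEAPSIZE`, `YARN_HEAPSIZE`) only take effect after the daemons restart, while the `mapreduce.*` values are read from the client config at job-submission time. A sketch, assuming a standard Hadoop 2.x layout with hypothetical hostnames datanode1..datanode3 and install path /opt/hadoop:

```shell
# Push the edited configs to each worker node (hostnames and path are assumptions).
for host in datanode1 datanode2 datanode3; do
  scp etc/hadoop/hadoop-env.sh etc/hadoop/yarn-env.sh etc/hadoop/mapred-site.xml \
      "${host}:/opt/hadoop/etc/hadoop/"
done

# Restart YARN (and the HDFS daemons if hadoop-env.sh changed) to pick up the new heap sizes.
sbin/stop-yarn.sh && sbin/start-yarn.sh
```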