Tencent Cloud Big Data Suite: Notes on Using the Hermes-MR Index Plugin

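Copyright notice: this is an original article by 王亮 (Wang Liang); please credit the source when reposting.
Original article: https://www.qcloud.com/community/article/121

Source: 騰雲閣 (Tengyunge) https://www.qcloud.com/community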

 

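Hermes is a powerful tool for multidimensional analysis. Using it involves two steps: index creation and data distribution.

Hermes has not yet been integrated into the TBDS suite (version 3.0), and some external customers need to run the Hermes component on clusters they deployed themselves. This raises the problem of adapting Hermes to an external Hadoop cluster.

After Hermes was integrated with one customer's external cluster, a stress test (2 TB of data, 445,604,010 rows, full indexing across 477 fields) was run with the single-node Hermes index-creation plugin. Because the data volume was too large, exceptions such as Out of Memory crashed the index plugin process, and the amount of index data actually produced fell far short of the actual data volume. For these reasons, the Data Platform team (數平) provided an MR-based index-creation plugin to make index creation more efficient.

The following records how the MR index plugin, which is built against Hadoop 2.2, was adapted to the external cluster.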

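I. Component versions in the cluster

Hermes version: hermes-2.1.0-1.x86_64
Hadoop cluster version: Hadoop 2.7.1.2.3.0.0-2557
Hadoop-common used by the Hermes-index-MR plugin: hadoop-common-2.2.0.jar

II. How to use the Hermes-MR plugin

1. Configuration changes required ($HERMES_INDEX_MR_HOME denotes the plugin's home directory):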

  • $HERMES_INDEX_MR_HOME/conf/hermes.properties
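    Changes: set hermes.zkConnectionString to this cluster's ZooKeeper address; set hermes.hadoop.conf.dir to this cluster's Hadoop configuration directory; set hermes.hadoop.home to this cluster's Hadoop installation home directory.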

  • $HERMES_INDEX_MR_HOME/conf/hermes_index.properties
    Changes: set hermes.hadoop.conf to this cluster's Hadoop configuration directory; set hermes.index.user.conf to the absolute path of the hermes-MR-index plugin's user configuration file.

  • $HERMES_INDEX_MR_HOME/conf/user_conf.xml
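    Changes: this is the hermes-MR-index plugin's user configuration file, and the default settings are usually sufficient. Note that the plugin lets you specify the field separator of the files being indexed, through the options higo.input.record.split and higo.input.record.ascii.split. higo.input.record.ascii.split takes precedence: once it is set, higo.input.record.split has no effect. The value of higo.input.record.split is the separator itself (e.g. |, \, ; and so on), whereas the value of higo.input.record.ascii.split is the separator's ASCII code as a number.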
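The precedence rule between the two separator options can be sketched as follows. This is only an illustration of the rule described above, not the plugin's own parsing code: the function name `resolve_separator` is hypothetical, the configuration is modeled as a plain dictionary, and the ASCII code is assumed to be written in decimal.

```python
def resolve_separator(conf):
    """Return the field separator implied by a user_conf-style mapping.

    Hypothetical helper: mirrors the documented precedence only.
    """
    ascii_code = conf.get("higo.input.record.ascii.split")
    if ascii_code not in (None, ""):
        # The ASCII-code option wins whenever it is set (assumed decimal).
        return chr(int(ascii_code))
    # Otherwise the literal separator is used as-is.
    return conf.get("higo.input.record.split")


print(repr(resolve_separator({"higo.input.record.split": "|"})))  # '|'
print(repr(resolve_separator({
    "higo.input.record.split": "|",
    "higo.input.record.ascii.split": "1",
})))  # '\x01'
```

The second call shows why the numeric option is useful: non-printable separators such as the control character with code 1 are awkward to type literally into an XML file.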

2. Running the plugin

  • Command, run from the plugin's home directory (labcluster is the logical name of the HDFS NameNodes (nn) under HA):

    sh bin/submit_index_job.sh \
    clk_tag_info_test_500 \
    20160722 \
    hdfs://labcluster/apps/hive/market_mid/clk_tag_info_test/ \
    hdfs://labcluster/user/hermes/demo_dir/clk_tag_info_test_500/ \
    hdfs://labcluster/user/hermes/demo_dir/schema/clk_tag_info_test_500_hermes.schema \
    key_id \
    3
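  • Parameters:
    sh bin/submit_index_job.sh <table name> <data date (time partition)> <source data location on HDFS (a single file or a directory)> <index output directory on HDFS> <schema file location on HDFS (must be created and uploaded manually)> <primary key> <number of index shards>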
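When the job is launched from a scheduler rather than by hand, it can help to assemble the seven positional arguments in one place. The sketch below is an assumption-laden illustration: `build_submit_command` is a hypothetical helper, and the checks (an eight-digit date as in the example, a positive shard count) are guesses about sensible input, not rules stated by the plugin.

```python
import re


def build_submit_command(table, data_date, source, index_dir, schema,
                         primary_key, shards):
    """Return the argv list for bin/submit_index_job.sh in the documented order."""
    # Assumption: the time partition is written as yyyymmdd, as in the example.
    if not re.fullmatch(r"\d{8}", data_date):
        raise ValueError("data_date should look like 20160722")
    if int(shards) < 1:
        raise ValueError("shards must be a positive integer")
    return ["sh", "bin/submit_index_job.sh", table, data_date, source,
            index_dir, schema, primary_key, str(int(shards))]


cmd = build_submit_command(
    "clk_tag_info_test_500",
    "20160722",
    "hdfs://labcluster/apps/hive/market_mid/clk_tag_info_test/",
    "hdfs://labcluster/user/hermes/demo_dir/clk_tag_info_test_500/",
    "hdfs://labcluster/user/hermes/demo_dir/schema/clk_tag_info_test_500_hermes.schema",
    "key_id",
    3,
)
print(" ".join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, cwd=plugin_home, check=True)`, where `plugin_home` stands for the directory referred to as $HERMES_INDEX_MR_HOME.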

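3. Checking the logs:

Once it is running, the index-creation plugin writes hermes.log and index.log under $HERMES_INDEX_MR_HOME/logs. The former holds Hermes-related records; the latter records the index-creation process, including information about the MR job. Normally index.log records whether the MR job was submitted successfully, together with its job ID, and the job's status can be checked on the Hadoop ResourceManager (RM) web UI. index.log also records Map/Reduce progress; on completion it prints "Job ${job.id} completed successfully" followed by information about the MR job (shown in a figure in the original article). If error entries appear, they need case-by-case analysis. The sections below summarize the problems met during this cluster adaptation. The plugin has now passed testing on a TBDS 3.0 (Hadoop 2.7) cluster.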
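A quick way to watch for the completion message is to scan index.log for the two job-status lines quoted in this article. The sketch below assumes each status message sits on a single line of the log file; the function name `job_outcomes` and the second job ID in the sample are made up for illustration.

```python
import re

# Patterns taken from the messages quoted in this article.
JOB_DONE = re.compile(r"Job (job_\d+_\d+) completed successfully")
JOB_FAILED = re.compile(r"Job (job_\d+_\d+) failed with state (\w+)")


def job_outcomes(lines):
    """Map each job ID seen in the log lines to its last reported outcome."""
    outcomes = {}
    for line in lines:
        done = JOB_DONE.search(line)
        if done:
            outcomes[done.group(1)] = "SUCCEEDED"
            continue
        failed = JOB_FAILED.search(line)
        if failed:
            outcomes[failed.group(1)] = failed.group(2)
    return outcomes


sample = [
    "2016-07-22 15:25:40,657 (INFO org.apache.hadoop.mapreduce.Job 1374): "
    "Job job_1469110119300_0022 failed with state FAILED due to: ...",
    # Hypothetical later run, for illustration only.
    "(INFO org.apache.hadoop.mapreduce.Job 1367): "
    "Job job_1469110119300_0023 completed successfully",
]
print(job_outcomes(sample))
```

For a real log, pass an open file object, for example `job_outcomes(open(path, encoding="utf-8"))`, where `path` points at index.log.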

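4. Basic adaptation process

As mentioned above, the Hermes-MR-index plugin uses version 2.2 of Hadoop-common.jar, while the cluster itself runs Hadoop 2.7. Running the plugin directly to create an index produced the following "strange" exception.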

Diagnostics: Exception from container-launch.
Container id: container_e07_1469110119300_0022_02_000001
Exit code: 255
Stack trace: ExitCodeException exitCode=255: 
at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
at org.apache.hadoop.util.Shell.run(Shell.java:456)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

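Searching through all the exception logs turned up nothing. The Hadoop experts on the Data Platform team (數平), when consulted, suggested replacing the Hadoop*.jar packages used by the Hermes-MR-index plugin with the versions used in the cluster. This approach still ran into a series of problems at first, but in the end the Hermes-MR-index plugin ran normally in the Hadoop 2.7 environment.

The adaptation followed this approach:

  • Step 1: replace every hadoop-*.jar used by the Hermes-MR-index plugin with the version used in the cluster.
  • Step 2: run the plugin and read the errors in the log. Typically the newer version (2.7) has new jar dependencies, so an error is reported. From the missing class named in the error, find the corresponding jar and add it to the $HERMES_INDEX_MR_HOME/lib directory. Repeat until no more missing-class errors appear.
  • Step 3: while doing this, make sure the version of the jar that supplies each missing class matches the version actually used by the cluster (a problem discovered while repeating step 2).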
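Finding the jar that supplies a missing class is the repetitive part of this approach. A minimal sketch of that lookup, assuming the cluster's jars sit in a directory you can read: `jars_containing` is a hypothetical helper, and the demo builds a throwaway jar instead of touching a real installation.

```python
import tempfile
import zipfile
from pathlib import Path


def jars_containing(class_name, jar_dir):
    """Return the names of jars in jar_dir that contain the given class."""
    # A class a.b.C is stored inside a jar as a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(jar_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that are not valid archives.
            continue
    return hits


# Demo on a temporary directory with a made-up jar and class.
with tempfile.TemporaryDirectory() as demo_dir:
    with zipfile.ZipFile(Path(demo_dir) / "demo-1.0.jar", "w") as zf:
        zf.writestr("org/example/Foo.class", b"")
    print(jars_containing("org.example.Foo", demo_dir))  # ['demo-1.0.jar']
```

Pointing `jar_dir` at the cluster's own library directory rather than at a download keeps step 3 in view: the jar that is found is then by construction the version the cluster uses.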

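5. Summary of problems

The problems encountered while adapting the plugin to the cluster are summarized below:

  • Exception caused by the mapreduce.framework.name setting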

    2016-07-21 15:39:51,522 (ERROR 
    org.apache.hadoop.security.UserGroupInformation 1600): 
    PriviledgedActionException as:root (auth:SIMPLE) 
    cause:java.io.IOException: Cannot initialize Cluster. Please check 
    your configuration for mapreduce.framework.name and the correspond 
    server addresses.
    Exception in thread "main" java.io.IOException: Cannot initialize 
    Cluster. Please check your configuration for 
    mapreduce.framework.name and the correspond server addresses.

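    Solution: check the cluster's Hadoop configuration, that is, the configuration directory that the Hadoop configuration path in hermes.properties points to (you can also copy the cluster's configuration out and modify the copy separately). In mapred-site.xml, mapreduce.framework.name was set to yarn-tez, but the plugin currently supports only yarn. Changing just this setting to yarn and saving the file resolved the exception.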
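If the change is made on a private copy of the cluster configuration, it can be scripted. The following is a sketch under that assumption, working on the XML text of a Hadoop-style site file; `set_property` is a hypothetical helper, and the sample document is a trimmed stand-in for a real mapred-site.xml.

```python
import xml.etree.ElementTree as ET


def set_property(site_xml, name, value):
    """Return site_xml with the named property set, adding it if absent."""
    root = ET.fromstring(site_xml)
    for prop in root.findall("property"):
        if (prop.findtext("name") or "").strip() == name:
            value_el = prop.find("value")
            if value_el is None:
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
            break
    else:
        # Property not present: append a new <property> element.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


sample_site = (
    "<configuration><property>"
    "<name>mapreduce.framework.name</name><value>yarn-tez</value>"
    "</property></configuration>"
)
print(set_property(sample_site, "mapreduce.framework.name", "yarn"))
```

Editing a copy keeps the cluster's own yarn-tez setting intact for other workloads; only the directory that hermes.properties points to needs to see the modified file.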

  • The plugin cannot submit jobs to the cluster

    2016-07-21 20:14:49,355 (ERROR
    org.apache.hadoop.security.UserGroupInformation 1600): 
    PriviledgedActionException as:hermes (auth:SIMPLE) 
    cause:java.io.IOException: Failed to run job : 
    org.apache.hadoop.security.AccessControlException: User hermes 
    cannot submit applications to queue root.default

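    Solution: the hermes user was denied permission when submitting a job to YARN (queue root.default in the log above). Changing the YARN cluster's permissions to allow the hermes user fixes this. TBDS 3.0 provides a convenient access-control page for the change.

  • Variable-substitution error when submitting a job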

    Exception message:
    /hadoop/data1/hadoop/yarn/local/usercache/hermes/appcache/applicati
    on_1469110119300_0004/container_e07_1469110119300_0004_02_000001/lau
    nch_container.sh: line 9: 
    $PWD:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-
    client/*:/usr/hdp/current/hadoop-
    client/lib/*:/usr/hdp/current/hadoop-hdfs-
    client/*:/usr/hdp/current/hadoop-hdfs-
    client/lib/*:/usr/hdp/current/hadoop-yarn-
    client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-
    framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-
    framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-
    framework/hadoop/share/hadoop/common/*:$PWD/mr-
    framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-
    framework/hadoop/share/hadoop/yarn/*:$PWD/mr-
    framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-
    framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-
    framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-
    framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/${hdp.version}/ha
    doop/lib/hadoop-lzo-
    0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure:job.jar/job.jar:job
    .jar/classes/:job.jar/lib/*:$PWD/*: bad substitution
    /hadoop/data1/hadoop/yarn/local/usercache/hermes/appcache/applicatio
    n_1469110119300_0004/container_e07_1469110119300_0004_02_000001/laun
    ch_container.sh: line 67: $JAVA_HOME/bin/java -
    Dlog4j.configuration=container-log4j.properties -
    Dyarn.app.container.log.dir=/hadoop/data1/yarn/container-
    logs/application_1469110119300_0004/container_e07_1469110119300_0004
    _02_000001 -Dyarn.app.container.log.filesize=0 -
    Dhadoop.root.logger=INFO,CLA -Dhdp.version=${hdp.version} -Xmx5120m 
    org.apache.hadoop.mapreduce.v2.app.MRAppMaster 
    1>/hadoop/data1/yarn/container-
    logs/application_1469110119300_0004/container_e07_1469110119300_0004
    _02_000001/stdout 2>/hadoop/data1/yarn/container-
    logs/application_1469110119300_0004/container_e07_1469110119300_0004
    _02_000001/stderr : bad substitution
    Stack trace: ExitCodeException exitCode=1: 
    /hadoop/data1/hadoop/yarn/local/usercache/hermes/appcache/applicatio
    n_1469110119300_0004/container_e07_1469110119300_0004_02_000001/laun
    ch_container.sh: line 9: 
    $PWD:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-
    client/*:/usr/hdp/current/hadoop-
    client/lib/*:/usr/hdp/current/hadoop-hdfs-
    client/*:/usr/hdp/current/hadoop-hdfs-
    client/lib/*:/usr/hdp/current/hadoop-yarn-
    client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-
    framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-
    framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-
    framework/hadoop/share/hadoop/common/*:$PWD/mr-
    framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-
    framework/hadoop/share/hadoop/yarn/*:$PWD/mr-
    framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-
    framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-
    framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-
    framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/${hdp.version}/ha
    doop/lib/hadoop-lzo-
    0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure:job.jar/job.jar:job
    .jar/classes/:job.jar/lib/*:$PWD/*: bad substitution
    /hadoop/data1/hadoop/yarn/local/usercache/hermes/appcache/applicatio
    n_1469110119300_0004/container_e07_1469110119300_0004_02_000001/laun
    ch_container.sh: line 67: $JAVA_HOME/bin/java -
    Dlog4j.configuration=container-log4j.properties -
    Dyarn.app.container.log.dir=/hadoop/data1/yarn/container-
    logs/application_1469110119300_0004/container_e07_1469110119300_0004
    _02_000001 -Dyarn.app.container.log.filesize=0 -
    Dhadoop.root.logger=INFO,CLA -Dhdp.version=${hdp.version} -Xmx5120m 
    org.apache.hadoop.mapreduce.v2.app.MRAppMaster 
    1>/hadoop/data1/yarn/container-
    logs/application_1469110119300_0004/container_e07_1469110119300_0004
    _02_000001/stdout 2>/hadoop/data1/yarn/container-
    logs/application_1469110119300_0004/container_e07_1469110119300_0004
    _02_000001/stderr : bad substitution

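    Solution: "bad substitution" indicates that some configured parameters were not substituted properly. The variables that appear in the exception are $PWD, $JAVA_HOME, ${hdp.version}, and $HADOOP_CONF_DIR. One way to narrow it down is to find these variables in the Hadoop configuration files and replace them one by one with actual values until the error stops appearing. In practice the cause was that hdp.version had no value: the literal ${hdp.version} reached launch_container.sh, and the shell rejects it because a variable name cannot contain a dot. Either add a setting for hdp.version to the Hadoop configuration, or replace every use of the variable with the actual value.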
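To spot this kind of leftover placeholder before submitting a job, one can scan the configured values for `${...}` names that a shell could not expand. The sketch below is illustrative only: both helper names are hypothetical, the check deliberately ignores shell forms such as `${VAR:-default}`, and the version string in the demo is an example value rather than a confirmed setting for this cluster.

```python
import re

PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")
SHELL_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def bad_placeholders(text):
    """Names inside ${...} that are not valid shell variable names."""
    names = PLACEHOLDER.findall(text)
    return sorted({n for n in names if not SHELL_NAME.fullmatch(n)})


def fill(text, values):
    """Replace ${name} with values[name]; leave unknown names untouched."""
    return PLACEHOLDER.sub(lambda m: values.get(m.group(1), m.group(0)), text)


classpath = ("/usr/hdp/${hdp.version}/hadoop/lib/"
             "hadoop-lzo-0.6.0.${hdp.version}.jar:$PWD/*")
print(bad_placeholders(classpath))  # ['hdp.version']
# Example value only; use the version actually installed on the cluster.
print(fill(classpath, {"hdp.version": "2.3.0.0-2557"}))
```

Running `bad_placeholders` over the values in mapred-site.xml and yarn-site.xml points at the entries that need either a defined hdp.version or a literal value.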

  • A "strange" error

    2016-07-22 15:25:40,657 (INFO org.apache.hadoop.mapreduce.Job 1374): 
    Job job_1469110119300_0022 failed with state FAILED due to: 
    Application application_1469110119300_0022 failed 2 times due to AM 
    Container for appattempt_1469110119300_0022_000002 exited with  
    exitCode: 255
    For more detailed output, check application tracking 
    page:http://bdlabnn2:8088/cluster/app/application_1469110119300_0022
    Then, click on links to logs of each attempt.
    Diagnostics: Exception from container-launch.
    Container id: container_e07_1469110119300_0022_02_000001
    Exit code: 255
    Stack trace: ExitCodeException exitCode=255: 
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
    at org.apache.hadoop.util.Shell.run(Shell.java:456)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java
    :722)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.l
    aunchContainer(DefaultContainerExecutor.java:211)
    at 
    org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.
    ContainerLaunch.call(ContainerLaunch.java:302)

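    Solution: this was the hardest error to resolve. It was finally fixed by aligning the plugin's jar versions with the cluster's, as described in this article; see "Basic adaptation process" for the approach. The jars that were replaced or added are listed below: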

    jackson-core-2.2.3.jar
    jersey-json-1.9.jar
    jersey-client-1.9.jar
    jersey-core-1.9.jar
    jackson-xc-1.9.13.jar
    jersey-guice-1.9.jar
    jersey-server-1.9.jar
    jackson-jaxrs-1.9.13.jar
    commons-io-2.5.jar
    htrace-core-3.1.0-incubating.jar
    hermes-index-2.1.2.jar
    hadoop-cdh3-hdfs-2.2.0.jar
    hadoop-cdh3-core-2.2.0.jar
    hadoop-yarn-common-2.7.2.jar
    hadoop-yarn-client-2.7.2.jar
    hadoop-yarn-api-2.7.2.jar
    hadoop-mapreduce-client-jobclient-2.7.2.jar
    hadoop-mapreduce-client-core-2.7.2.jar
    hadoop-mapreduce-client-common-2.7.2.jar
    hadoop-hdfs-2.7.2.jar
    hadoop-common-2.7.2.jar
    hadoop-auth-2.7.2.jar
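  • Cannot connect to the YARN RM job-submission port
    In the TBDS 3.0 environment, after a job was submitted the log reported that reconnecting to the RM server had failed, and the error kept repeating.
    Solution: inspecting the running processes showed that the cluster accepts MR requests on port 8032. After the port in the RM server address setting was changed accordingly, the job went through.

  • Exception that appeared after all jars had been replaced or added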

    Exception in thread "main" java.lang.VerifyError: class 
    org.codehaus.jackson.xc.JaxbAnnotationIntrospector overrides final 
    method findDeserializer.(Lorg/codehaus/jackso
    n/map/introspect/Annotated;)Ljava/lang/Object;
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at java.lang.Class.getDeclaredMethods0(Native Method)
    at java.lang.Class.privateGetDeclaredMethods(Class.java:2615)
    at java.lang.Class.getDeclaredMethods(Class.java:1860)
    at com.sun.jersey.core.reflection.MethodList.getAllDeclaredMethods(Meth
    odList.java:70)
    at com.sun.jersey.core.reflection.MethodList.<init>(MethodList.java:64)
    at com.sun.jersey.core.spi.component.ComponentConstructor.getPostConstr
    uctMethods(ComponentConstructor.java:131)
    at com.sun.jersey.core.spi.component.ComponentConstructor.<init>(ComponentConstructor.java:123)
    at com.sun.jersey.core.spi.component.ProviderFactory.__getComponentProv
    ider(ProviderFactory.java:165)
    at com.sun.jersey.core.spi.component.ProviderFactory._getComponentProvider(ProviderFactory.java:159)
    at com.sun.jersey.core.spi.component.ProviderFactory.getComponentProvider(ProviderFactory.java:153)
    at com.sun.jersey.core.spi.component.ProviderServices.getComponent(ProviderServices.java:251)

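    Solution: the class named in this exception belongs to jackson*.jar, so the problem lies with that family of jars. Inspection showed that the lib directory of the Hermes-MR-index plugin contained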

    jackson-core-asl-1.7.3.jar
    jackson-mapper-asl-1.7.3.jar
    jackson-core-asl-1.9.13.jar
    jackson-mapper-asl-1.9.13.jar

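    That is, each of these two jars was present in two versions. The Hadoop cluster uses version 1.9.13, so after the two 1.7.3 jars were deleted from the plugin's lib directory, the plugin ran normally. The root cause was a jar version conflict.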
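Duplicates like these are easy to miss by eye in a long lib directory. A small sketch of an automated check, assuming jar files follow the usual `<artifact>-<version>.jar` naming; `conflicting_jars` is a hypothetical helper and works on file names only, without opening the jars.

```python
import re
from collections import defaultdict

# Assumed naming convention: the version starts at the first "-<digit>".
JAR_NAME = re.compile(r"^(?P<artifact>.+?)-(?P<version>\d[\w.\-]*)\.jar$")


def conflicting_jars(jar_names):
    """Return {artifact: [versions]} for artifacts present in several versions."""
    versions = defaultdict(set)
    for name in jar_names:
        match = JAR_NAME.match(name)
        if match:
            versions[match.group("artifact")].add(match.group("version"))
    return {a: sorted(v) for a, v in versions.items() if len(v) > 1}


lib = [
    "jackson-core-asl-1.7.3.jar",
    "jackson-mapper-asl-1.7.3.jar",
    "jackson-core-asl-1.9.13.jar",
    "jackson-mapper-asl-1.9.13.jar",
    "commons-io-2.5.jar",
]
print(conflicting_jars(lib))
```

For a real directory, the input can come from `os.listdir` on the plugin's lib directory. Which of the reported versions to keep still has to be decided by checking what the cluster itself uses.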

  • MR framework path cannot be found

    Exception in thread "main" java.lang.IllegalArgumentException: Could 
    not locate MapReduce framework name 'mr-framework' in 
    mapreduce.application.classpath
    at org.apache.hadoop.mapreduce.v2.util.MRApps.setMRFrameworkClasspath(M
    RApps.java:231)
    at org.apache.hadoop.mapreduce.v2.util.MRApps.setClasspath(MRApps.java:258)
    at org.apache.hadoop.mapred.YARNRunner.createApplicationSubmissionContext(YARNRunner.java:458)
    at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:285)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:240)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1308)
    at com.tencent.hermes.hadoop.job.HermesIndexJob.subRun(HermesIndexJob.java:262)
    at com.tencent.hermes.hadoop.job.HermesIndexJob.run(HermesIndexJob.java:122)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at com.tencent.hermes.hadoop.job.SubmitIndexJob.call(SubmitIndexJob.java:194)
    at com.tencent.hermes.hadoop.job.SubmitIndexJob.main(SubmitIndexJob.java:101)

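    Solution: the message says that the MR framework name 'mr-framework' could not be found in mapreduce.application.classpath. Checking mapred-site.xml confirmed that this setting was misconfigured. After the MR framework paths were added to it, the job went through. The resulting property is shown below (the original article marked the newly added entries in red).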

    <property>
      <name>mapreduce.application.classpath</name>
      <value>$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/2.2.0.0-2041/hadoop/lib/hadoop-lzo-0.6.0.2.2.0.0-2041.jar:/etc/hadoop/conf/secure</value>
    </property>
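The exception text suggests that job submission fails when no entry of mapreduce.application.classpath mentions the framework name. A rough pre-flight check along those lines is sketched below. It is an approximation inferred from the error message, not a copy of Hadoop's logic: `classpath_names_framework` is a hypothetical helper, the comma split assumes the value may be a comma-separated list, and the "before" value is invented for the demo.

```python
def classpath_names_framework(classpath, framework="mr-framework"):
    """True if any comma-separated classpath entry mentions the framework name."""
    entries = [entry.strip() for entry in classpath.split(",")]
    return any(framework in entry for entry in entries if entry)


# Hypothetical misconfigured value, for illustration only.
before = "/usr/hdp/current/hadoop-client/*:/etc/hadoop/conf/secure"
# Abbreviated form of the corrected value from this article.
after = ("$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:"
         "$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:"
         "/etc/hadoop/conf/secure")

print(classpath_names_framework(before))  # False
print(classpath_names_framework(after))   # True
```

A check like this only confirms that the name appears; whether the paths under $PWD/mr-framework actually exist in the container still depends on the framework archive the cluster distributes.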