[Original] Experience Sharing (6): How to View the Logs of Oozie Tasks Submitted to YARN

You can view a workflow's details by its Oozie job ID with the following command:

oozie job -info 0012077-180830142722522-oozie-hado-Wnode

 

The workflow details look like this:

Job ID : 0012077-180830142722522-oozie-hado-W

------------------------------------------------------------------------------------------------------------------------------------

Workflow Name : test_wf

App Path      : hdfs://hdfs_name/oozie/test_wf.xml

Status        : KILLED

Run           : 0

User          : hadoop

Group         : -

Created       : 2018-09-25 02:51 GMT

Started       : 2018-09-25 02:51 GMT

Last Modified : 2018-09-25 02:53 GMT

Ended         : 2018-09-25 02:53 GMT

CoordAction ID: -

 

Actions

------------------------------------------------------------------------------------------------------------------------------------

ID                                                                            Status    Ext ID                 Ext Status Err Code 

------------------------------------------------------------------------------------------------------------------------------------

0012077-180830142722522-oozie-hado-W@:start:                                  OK        -                      OK         -        

------------------------------------------------------------------------------------------------------------------------------------

0012077-180830142722522-oozie-hado-W@test_spark_task                          ERROR     application_1537326594090_5663    FAILED/KILLED    JA018    

------------------------------------------------------------------------------------------------------------------------------------

0012077-180830142722522-oozie-hado-W@Kill                                     OK        -                      OK         E0729    

------------------------------------------------------------------------------------------------------------------------------------

 

The failed action is defined as follows:

<action name="test_spark_task"> 

        <spark xmlns="uri:oozie:spark-action:0.1"> 

            <job-tracker>${job_tracker}</job-tracker> 

            <name-node>${name_node}</name-node> 

            <master>${jobmaster}</master> 

            <mode>${jobmode}</mode> 

            <name>${jobname}</name> 

            <class>${jarclass}</class> 

            <jar>${jarpath}</jar> 

            <spark-opts>--executor-memory 4g --executor-cores 2 --num-executors 4 --driver-memory 4g</spark-opts> 

        </spark>
</action>

 

On YARN, the application corresponding to application_1537326594090_5663 appears as follows:

application_1537326594090_5663       hadoop oozie:launcher:T=spark:W=test_wf:A=test_spark_task:ID=0012077-180830142722522-oozie-hado-W         Oozie Launcher

 

Inspecting the logs of application_1537326594090_5663 reveals:

2018-09-25 10:52:05,237 [main] INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl  - Submitted application application_1537326594090_5664

 

On YARN, the application corresponding to application_1537326594090_5664 appears as follows:

application_1537326594090_5664       hadoop    TestSparkTask SPARK

 

That is, application_1537326594090_5664 is the actual Spark job for the action. For why there is an extra step in between, see the class structure and core code at http://www.javashuo.com/article/p-snlqpfjk-kp.html

In brief: when Oozie executes an action it uses an ActionExecutor (the most important subclass is JavaActionExecutor; the Hive, Spark, and other action executors are all subclasses of it). JavaActionExecutor first submits a LauncherMapper (a map task) to YARN, which runs a LauncherMain (concrete actions use subclasses such as JavaMain or SparkMain). For a Spark action, SparkMain runs and calls org.apache.spark.deploy.SparkSubmit to submit the actual job.
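Given this two-step structure, the real job's applicationId can be recovered from the launcher's logs. A minimal sketch, assuming the `yarn` CLI is available and log aggregation is enabled (the function name is mine, not an Oozie or YARN utility):

```shell
# Read launcher log text on stdin and print the first child applicationId
# it submitted (the "Submitted application ..." line from YarnClientImpl).
extract_child_app() {
  grep -o 'Submitted application application_[0-9_]*' | awk '{print $3}' | head -n 1
}

# Against a live cluster you would pipe the launcher's logs through it:
#   yarn logs -applicationId application_1537326594090_5663 | extract_child_app
```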

 

If you submitted a Spark job, the method above lets you trace the actual job's applicationId.
If you submitted a Hive2 action, it is actually launched via beeline. Starting with Hive 2, beeline's log output has been simplified: unlike the hive CLI, it no longer shows the detailed applicationId and progress. In that case there are two approaches:

1) Modify the Hive code so that the beeline command produces the same detailed log output as the hive command

For details, see: http://www.javashuo.com/article/p-truuknfj-d.html

2) Find the job manually via its application tag

When Oozie submits a job via beeline, it adds a mapreduce.job.tags parameter, for example:

--hiveconf
mapreduce.job.tags=oozie-9f896ad3d40c261235dc6858cadb885c

However, this tag cannot be queried with the yarn application command, so you have to check candidates manually one by one (the actually launched job's applicationId increments from the LauncherMapper's applicationId), after which you can find the applicationId of the real job.
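Since the real job's id increments from the launcher's, one way to enumerate the candidates to check is a small helper like this (a sketch assuming bash; the function name is mine):

```shell
# Print the next N candidate applicationIds after a given one, by
# incrementing the trailing sequence number (padded to >= 4 digits,
# matching YARN's applicationId format).
next_app_ids() {
  local app_id="$1" count="$2"
  local prefix="${app_id%_*}" seq="${app_id##*_}" i
  for ((i = 1; i <= count; i++)); do
    printf '%s_%04d\n' "$prefix" $((10#$seq + i))
  done
}

# next_app_ids application_1537326594090_5663 2
# then inspect each candidate with: yarn application -status <id>
```

On newer Hadoop versions the ResourceManager REST API may also support filtering by tag directly (`/ws/v1/cluster/apps?applicationTags=...`), which avoids the manual scan.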

 

In addition, you can view an application's details, such as its configuration and tasks, on the JobHistory Server.
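For MapReduce jobs (including the LauncherMapper), the JobHistory Server's REST API exposes the job configuration; the MapReduce job id is the applicationId with the `application_` prefix replaced by `job_`. A sketch, where `historyserver:19888` is a placeholder for your JobHistory Server address:

```shell
# Build the JobHistory REST URL for a job's configuration from its
# YARN applicationId ("historyserver:19888" is a placeholder host).
history_conf_url() {
  local job_id="${1/application_/job_}"
  echo "http://historyserver:19888/ws/v1/history/mapreduce/jobs/${job_id}/conf"
}

# e.g.: curl "$(history_conf_url application_1537326594090_5663)"
```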

To see the complete SQL executed by a Hive job, see: http://www.javashuo.com/article/p-rpzfhufg-hg.html
