CDH Pitfalls: Exporting Data to MySQL with Sqoop


I recently used Sqoop to export data from Hive to MySQL and ran into a series of problems. This post records the issue so that you can avoid stepping into the same pit!

The export command

sqoop export --connect jdbc:mysql://192.168.1.78:3306/data \
--username root \
-P \
--export-dir '/user/hive/warehouse/personas.db/user_attribute/000000_0' \
--table dm_user_attribute \
--input-fields-terminated-by '|' \
--input-null-non-string '\\N' \
--input-null-string '\\N' \
--lines-terminated-by '\n' \
-m 1
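To make the flags above concrete, here is a minimal sketch (the sample row is made up, not from the real table) of how Sqoop interprets one line of the export file: fields are split on `|` per `--input-fields-terminated-by`, and a literal `\N` marks SQL NULL per the two `--input-null-*` flags.

```shell
# Made-up sample line in the format produced by the Hive table above.
sample='1001|alice|\N|2018-07-23'

# Fields split on '|'  (--input-fields-terminated-by '|')
id=$(printf '%s' "$sample" | cut -d'|' -f1)
email=$(printf '%s' "$sample" | cut -d'|' -f3)

# A literal \N means SQL NULL  (--input-null-string / --input-null-non-string '\N')
if [ "$email" = '\N' ]; then
  email='NULL'
fi

printf '%s %s\n' "$id" "$email"
```

If the file's actual delimiter or null token differs from these flags, every row fails to parse, which is one of the common causes of the opaque failure shown below.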

Environment

CentOS 7 + CDH 5.7.2 + the Sqoop bundled with it

The error output

Below is what the console printed after I ran the command on the server.

Warning: /opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
18/07/23 11:54:45 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.7.2
18/07/23 11:54:45 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
18/07/23 11:54:45 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
18/07/23 11:54:45 INFO tool.CodeGenTool: Beginning code generation
18/07/23 11:54:45 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dm_user_attribute` AS t LIMIT 1
18/07/23 11:54:45 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dm_user_attribute` AS t LIMIT 1
18/07/23 11:54:45 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce
Note: /tmp/sqoop-root/compile/2322b82e8ef7190a66357528d5fbddae/dm_user_attribute.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
18/07/23 11:54:47 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/2322b82e8ef7190a66357528d5fbddae/dm_user_attribute.jar
18/07/23 11:54:47 INFO mapreduce.ExportJobBase: Beginning export of dm_user_attribute
18/07/23 11:54:47 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
18/07/23 11:54:47 INFO Configuration.deprecation: mapred.map.max.attempts is deprecated. Instead, use mapreduce.map.maxattempts
18/07/23 11:54:48 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
18/07/23 11:54:48 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
18/07/23 11:54:48 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
18/07/23 11:54:48 INFO client.RMProxy: Connecting to ResourceManager at 192.168.1.152:8032
18/07/23 11:54:49 INFO input.FileInputFormat: Total input paths to process : 1
18/07/23 11:54:49 INFO input.FileInputFormat: Total input paths to process : 1
18/07/23 11:54:49 INFO mapreduce.JobSubmitter: number of splits:1
18/07/23 11:54:49 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1528444677205_1338
18/07/23 11:54:50 INFO impl.YarnClientImpl: Submitted application application_1528444677205_1338
18/07/23 11:54:50 INFO mapreduce.Job: The url to track the job: http://daojia02:8088/proxy/application_1528444677205_1338/
18/07/23 11:54:50 INFO mapreduce.Job: Running job: job_1528444677205_1338
18/07/23 11:54:55 INFO mapreduce.Job: Job job_1528444677205_1338 running in uber mode : false
18/07/23 11:54:55 INFO mapreduce.Job:  map 0% reduce 0%
18/07/23 11:55:00 INFO mapreduce.Job:  map 100% reduce 0%
18/07/23 11:55:01 INFO mapreduce.Job: Job job_1528444677205_1338 failed with state FAILED due to: Task failed task_1528444677205_1338_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0

18/07/23 11:55:01 INFO mapreduce.Job: Counters: 8
	Job Counters 
		Failed map tasks=1
		Launched map tasks=1
		Data-local map tasks=1
		Total time spent by all maps in occupied slots (ms)=2855
		Total time spent by all reduces in occupied slots (ms)=0
		Total time spent by all map tasks (ms)=2855
		Total vcore-seconds taken by all map tasks=2855
		Total megabyte-seconds taken by all map tasks=2923520
18/07/23 11:55:01 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
18/07/23 11:55:01 INFO mapreduce.ExportJobBase: Transferred 0 bytes in 13.576 seconds (0 bytes/sec)
18/07/23 11:55:01 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
18/07/23 11:55:01 INFO mapreduce.ExportJobBase: Exported 0 records.
18/07/23 11:55:01 ERROR tool.ExportTool: Error during export: Export job failed!

When I saw this console output, I nearly lost my mind. What is this? It only tells you the export failed and the job aborted — but where is the actual error message? Did you feel the same way when you saw it? How do you solve this? Where do you even start?

Sqoop's error log

After two days of poking around, I finally figured out how to track this down. The point is not this specific error: the real error message simply never shows up on the console. You can only see it in CDH's web management UI. Below I'll show you how to find the log for this Sqoop job in the CDH UI.

Step 1

As shown in the figure below, click YARN to enter the YARN details page. You may ask why it's not a Sqoop page: a Sqoop job is ultimately translated into a MapReduce job, so to see how it executed, you still have to go to the YARN details page.

Step 2

The figure below shows the YARN details page. Click the Applications entry to reach the list of executed jobs, where you can see each job and its result. In the figure below there is clearly one failed job. Follow the steps shown to open the next page.

Step 3

This page shows fairly detailed information for a single job, but it's still not the page we're ultimately after. Find the logs hyperlink framed in the figure below and click it to go to the next page.

Step 4

This looks like the log page at last — sorry, not yet. Scroll down and you'll see the text shown in the figure. This page only shows the execution flow of the job; the actual error message is on yet another page. Click the here hyperlink shown in the figure to open it.

Step 5

After all those pages, we finally reach the one we wanted: our dear error page. Here you can see the real reason the job failed and fix the problem based on the error message. Solutions for the errors shown on this page are mostly easy to find online once you have the message in hand.
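If you prefer the command line to clicking through five pages, the same task log can usually be fetched with the standard YARN CLI. A minimal sketch: pull the application id out of the captured console output (the log line below is copied from the run above), then pass it to `yarn logs` on a cluster node.

```shell
# Log line copied from the console output above.
log_line='18/07/23 11:54:50 INFO impl.YarnClientImpl: Submitted application application_1528444677205_1338'

# Extract the YARN application id.
app_id=$(printf '%s\n' "$log_line" | grep -o 'application_[0-9_]*')
echo "$app_id"

# On a cluster node with log aggregation enabled, fetch the full task logs:
# yarn logs -applicationId "$app_id"
```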

The problem in my case was a mismatch between the time/date column types in Hive and MySQL. Changing the time column type on either the MySQL or the Hive side, so that the two sides match, resolves the issue.
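As one hedged sketch of the fix on the MySQL side (the column name `reg_time` is hypothetical — substitute your actual mismatched column), the statement below changes the MySQL column to TIMESTAMP so it accepts Hive's `yyyy-MM-dd HH:mm:ss` values. It needs a live MySQL server to actually run.

```shell
# Hypothetical column name; adjust table/column/type to your schema.
fix_sql='ALTER TABLE dm_user_attribute MODIFY COLUMN reg_time TIMESTAMP NULL'
printf '%s\n' "$fix_sql"

# Run it against the target database from the export command:
# mysql -h 192.168.1.78 -P 3306 -u root -p data -e "$fix_sql"
```

Alternatively, cast or reformat the column on the Hive side before export — either direction works as long as both sides agree.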

I really didn't expect CDH to be such a pit. This problem tormented me for two whole days, but at least it's solved now, and the next time I run into it I'll be able to fix it right away.
