What is DataX?
DataX is an offline data synchronization tool/platform that is widely used inside Alibaba Group.
It provides efficient data synchronization between heterogeneous data sources, including MySQL, SQL Server, Oracle, PostgreSQL, HDFS, Hive, HBase, OTS, and ODPS.
1. Download the source code from GitHub: https://github.com/alibaba/DataX
DataX-master.zip
2. Install Maven, then compile and package the source with it:
mvn -U clean package assembly:assembly -Dmaven.test.skip=true
3. The compiled DataX is placed under /target/datax/:
{YOUR_DATAX_HOME}/target/datax/
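4. Fixes for errors encountered during the build:
4.1. Point Maven at the Aliyun Maven mirror by adding the following to the <mirrors> section of Maven's settings.xml: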
<mirror>
<id>nexus-aliyun</id>
<mirrorOf>central</mirrorOf>
<name>Nexus aliyun</name>
<url>https://maven.aliyun.com/repository/central</url>
</mirror>
4.2. The local build fails because the artifact com.aliyun.openservices:tablestore-streamclient:jar:1.0.0-SNAPSHOT cannot be found. Open the module's pom:
vim otsstreamreader/pom.xml
<dependency>
<groupId>com.aliyun.openservices</groupId>
<artifactId>tablestore-streamclient</artifactId>
<version>1.0.0-SNAPSHOT</version>
</dependency>
Here, change
<version>1.0.0-SNAPSHOT</version>
to
<version>1.0.0</version>
4.3. Building the DataX odps plugin module fails with:
[ERROR] Failed to execute goal on project odpsreader: Could not resolve dependencies for project com.alibaba.datax:odpsreader:jar:0.0.1-SNAPSHOT:
The following artifacts could not be resolved: com.alibaba.datax:datax-common:jar:0.0.1-SNAPSHOT,
com.alibaba.external:bouncycastle.provider:jar:1.38-jdk15: Could not find artifact com.alibaba.datax:datax-common:jar:0.0.1-SNAPSHOT in
dtwave (http://repo2.dtwave-inc.com/repository/public/) -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
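For comparison, the earlier odps-sdk-core-0.19.3-public.pom depended on
org.bouncycastle:bcprov-jdk15on:1.52
whereas the dependency is now
com.alibaba.external:bouncycastle.provider:1.38-jdk15
Cause: the newer dependency appears to be an Alibaba-internal jar, which external repositories cannot serve.
Fix: edit pom.xml
and switch com.aliyun.odps:odps-sdk-core to version 0.20.7-public.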
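The note does not say which pom.xml declares odps-sdk-core, so it can help to search the source tree first. The following is a rough Python sketch, not part of DataX. The function name find_dependency_versions is hypothetical, and the directory name DataX-master in the usage comment is an assumption based on the name of the downloaded zip.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    """Strip an XML namespace such as '{http://maven.apache.org/POM/4.0.0}'."""
    return tag.rsplit("}", 1)[-1]


def find_dependency_versions(pom_source, artifact_id):
    """List the declared versions of artifact_id in one pom.

    pom_source may be str or bytes. An entry is None when the
    dependency is declared without a <version> element.
    """
    root = ET.fromstring(pom_source)
    versions = []
    for elem in root.iter():
        if _local(elem.tag) != "dependency":
            continue
        # Collect the direct children: groupId, artifactId, version, ...
        fields = {_local(child.tag): (child.text or "").strip() for child in elem}
        if fields.get("artifactId") == artifact_id:
            versions.append(fields.get("version"))
    return versions


# Usage: print every pom.xml that declares odps-sdk-core.
#
#   from pathlib import Path
#   for pom in Path("DataX-master").rglob("pom.xml"):
#       found = find_dependency_versions(pom.read_bytes(), "odps-sdk-core")
#       if found:
#           print(pom, found)
```

This only reads the files. The version change itself is still made by editing the pom.xml that the search reports.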
5. The build succeeds.
6. Test: create EngineTest.java in Eclipse, using the default job.json: