Importing the Hadoop source code is no easy task: error after error, red everywhere. Enjoy the satisfaction of clearing a whole landscape of red!
Hadoop source: I chose hadoop-2.7.1-src.tar.gz.
Download: https://archive.apache.org/dist/hadoop/common/
JDK: Hadoop 2.7 is recommended to build with JDK 1.7. I used jdk-7u80-windows-x64.exe.
Eclipse: Oxygen.2 Release (4.7.2)
Maven: apache-maven-3.3.1.zip
Download: http://mirrors.shu.edu.cn/apache/maven/
Archived releases: https://archive.apache.org/dist/maven/binaries/
libprotoc: protoc-2.5.0-win32.zip
Releases: https://github.com/protocolbuffers/protobuf/releases
2.5.0 download: https://github.com/protocolbuffers/protobuf/releases?after=v3.0.0-alpha-4.1
I installed jdk-7u80-windows-x64.exe; the installation steps are omitted here.
Maven just needs to be unzipped and it is ready to use.
For installation details, see: Maven introduction and installation.
It is best to point Maven's remote repository at a mirror inside China, which makes downloads much faster. Here is a domestic mirror:
Aliyun:
<mirror>
  <id>nexus-aliyun</id>
  <mirrorOf>*</mirrorOf>
  <name>Nexus aliyun</name>
  <url>http://maven.aliyun.com/nexus/content/groups/public</url>
</mirror>
The official site now offers many versions, but Hadoop 2.7.1 only works with protoc 2.5.0.
This release is simple: after unzipping there are only two files, the executable protoc.exe and a readme.txt, as shown in the figure below:
There are two ways to make it available:
First: unzip it to a directory of your choice and add that path to the Path environment variable. Test the installation with the following command:
protoc --version
Output like the figure below means the installation succeeded:
Second: drop the executable protoc.exe directly into Maven's bin directory.
The executable has no extra dependencies; it only needs to sit somewhere the system can find and run it.
With the software above installed, you can start importing the source.
Unzip the Hadoop source to a directory of your choosing, ideally close to the drive root.
Enter the hadoop-maven-plugins folder inside the Hadoop source, open a cmd window there, and run:
mvn install
During this step Maven downloads a lot of artifacts, and the build may fail because some download did not complete. Re-run the command until you see output like the following, which means the step succeeded.
The error shown below is exactly why libprotoc 2.5.0 is required: I tried version 3.3.0 and it failed, with the build explicitly demanding 2.5.0.
Open a cmd window in the root directory of the Hadoop source and run:
mvn eclipse:eclipse -DskipTests
Output like the following means success; if it fails, simply run the command again.
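Re-running these commands by hand until the downloads succeed can be tedious. If you have a POSIX shell available (for example Git Bash on Windows), the retry loop can be sketched as a small helper; this is my own sketch, not part of Hadoop, and it assumes mvn is on the PATH when used that way:

```shell
#!/bin/sh
# retry: re-run a flaky command until it exits 0, up to a maximum
# number of attempts. Useful when a build fails only because some
# remote download timed out.
retry() {
    max=$1
    shift
    attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        echo "retrying ($attempt/$max): $*" >&2
    done
}

# Example usage, run inside hadoop-maven-plugins or the source root:
#   retry 10 mvn install
#   retry 10 mvn eclipse:eclipse -DskipTests
```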
For easier management, create a directory (working set) in Eclipse to hold the Hadoop projects. The steps are shown in the figure below:
Then click File -> Import, as shown below:
In the dialog that appears, find Existing Maven Projects under Maven and click Next, as shown below:
On the screen that follows, proceed as shown in the figure. When choosing the path, select the root of the source tree; otherwise, if other projects are present, ticking the right checkboxes during import becomes very tedious.
That tedium is shown in the figure below: if you selected the Hadoop source root, you can simply click Select All and then Finish.
After the import, my workspace looked like this:
Red everywhere. Painful to look at, but perfectly fine for reading the source.
From the command above that generated the Eclipse projects, the Hadoop project (reactor) order should be as follows:
[INFO] Apache Hadoop Main
[INFO] Apache Hadoop Project POM
[INFO] Apache Hadoop Annotations
[INFO] Apache Hadoop Project Dist POM
[INFO] Apache Hadoop Assemblies
[INFO] Apache Hadoop Maven Plugins
[INFO] Apache Hadoop MiniKDC
[INFO] Apache Hadoop Auth
[INFO] Apache Hadoop Auth Examples
[INFO] Apache Hadoop Common
[INFO] Apache Hadoop NFS
[INFO] Apache Hadoop KMS
[INFO] Apache Hadoop Common Project
[INFO] Apache Hadoop HDFS
[INFO] Apache Hadoop HttpFS
[INFO] Apache Hadoop HDFS BookKeeper Journal
[INFO] Apache Hadoop HDFS-NFS
[INFO] Apache Hadoop HDFS Project
[INFO] hadoop-yarn
[INFO] hadoop-yarn-api
[INFO] hadoop-yarn-common
[INFO] hadoop-yarn-server
[INFO] hadoop-yarn-server-common
[INFO] hadoop-yarn-server-nodemanager
[INFO] hadoop-yarn-server-web-proxy
[INFO] hadoop-yarn-server-applicationhistoryservice
[INFO] hadoop-yarn-server-resourcemanager
[INFO] hadoop-yarn-server-tests
[INFO] hadoop-yarn-client
[INFO] hadoop-yarn-server-sharedcachemanager
[INFO] hadoop-yarn-applications
[INFO] hadoop-yarn-applications-distributedshell
[INFO] hadoop-yarn-applications-unmanaged-am-launcher
[INFO] hadoop-yarn-site
[INFO] hadoop-yarn-registry
[INFO] hadoop-yarn-project
[INFO] hadoop-mapreduce-client
[INFO] hadoop-mapreduce-client-core
[INFO] hadoop-mapreduce-client-common
[INFO] hadoop-mapreduce-client-shuffle
[INFO] hadoop-mapreduce-client-app
[INFO] hadoop-mapreduce-client-hs
[INFO] hadoop-mapreduce-client-jobclient
[INFO] hadoop-mapreduce-client-hs-plugins
[INFO] Apache Hadoop MapReduce Examples
[INFO] hadoop-mapreduce
[INFO] Apache Hadoop MapReduce Streaming
[INFO] Apache Hadoop Distributed Copy
[INFO] Apache Hadoop Archives
[INFO] Apache Hadoop Rumen
[INFO] Apache Hadoop Gridmix
[INFO] Apache Hadoop Data Join
[INFO] Apache Hadoop Ant Tasks
[INFO] Apache Hadoop Extras
[INFO] Apache Hadoop Pipes
[INFO] Apache Hadoop OpenStack support
[INFO] Apache Hadoop Amazon Web Services support
[INFO] Apache Hadoop Azure support
[INFO] Apache Hadoop Client
[INFO] Apache Hadoop Mini-Cluster
[INFO] Apache Hadoop Scheduler Load Simulator
[INFO] Apache Hadoop Tools Dist
[INFO] Apache Hadoop Tools
[INFO] Apache Hadoop Distribution
The following two steps must be performed for every project.
Re-establish the pom.xml parent inheritance for every project so that they share a uniform Group Id and version.
As shown in the figure below: open the pom file and simply re-select the parent.
In Java Build Path -> Libraries, change the JRE and tools.jar entries to your own version; mine is 1.7.0_80, as shown below:
After fixing the Java Build Path, change the Java Compiler to the matching level; again, 1.7 in my case.
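After the parent is re-selected, the parent section of a module's pom.xml ends up looking roughly like the fragment below. This is a sketch: most Hadoop 2.7.1 modules inherit from org.apache.hadoop:hadoop-project, but the exact artifactId and relative path depend on where the module sits in the source tree.

```xml
<!-- Sketch of a re-selected parent section; the artifactId may differ
     per module (e.g. some modules inherit from an intermediate POM). -->
<parent>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-project</artifactId>
  <version>2.7.1</version>
</parent>
```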
As shown in the figure below: move the XML declaration of this file up to the very first line.
For details, see: XML file error: the processing instruction target matching "[xX][mM][lL]" is not allowed.
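The rule itself is simple and can be illustrated with a minimal well-formed file (the element name here is just an example): the declaration must be the very first characters of the document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Nothing may precede the declaration above: no blank line,
     no whitespace, and no comment, or the parser reports the
     "processing instruction target matching [xX][mM][lL]" error. -->
<configuration>
</configuration>
```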
The hadoop-common project has one remaining error: the .avsc files are Avro schema files, and the corresponding .java files must be generated from them as follows.
Jar: avro-tools-1.7.4.jar
Download: https://archive.apache.org/dist/avro/
Enter "hadoop-common-project\hadoop-common\src\test\avro" under the source root, open cmd there, and run:
java -jar <directory containing the jar>\avro-tools-1.7.4.jar compile schema avroRecord.avsc ..\java
REM For example, I put the jar in F:\bigdata\hadoop, so the command is:
java -jar F:\bigdata\hadoop\avro-tools-1.7.4.jar compile schema avroRecord.avsc ..\java
Right-click the hadoop-common project in Eclipse and Refresh. If that does not clear the errors, refresh the specific package containing the failing source files; if that still fails, restart Eclipse.
Enter "hadoop-common-project\hadoop-common\src\test\proto" under the source root, open a cmd window, and run:
protoc --java_out=..\java *.proto
The protoc here is the program downloaded above.
Right-click hadoop-common in Eclipse and Refresh. If that does not clear the errors, refresh the specific package containing the failing source files; if that still fails, restart Eclipse.
In Eclipse, right-click the hadoop-streaming project and choose Properties; select Java Build Path in the left pane, open the Source tab on the right, and delete the broken path entry.
Click the Link Source button and choose "<your source root>/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/conf" as the linked folder; the link name can stay as suggested (or be anything you like);
add capacity-scheduler.xml to the inclusion patterns and **/*.java to the exclusion patterns, matching the broken entry. When done, delete the broken entry and refresh the hadoop-streaming project.
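If you want to verify the result, the Link Source dialog ultimately writes an entry into the project's .classpath file roughly like the fragment below. This is a sketch: the path attribute reflects the link name you chose in the dialog, and the surrounding file contains the project's other classpath entries.

```xml
<!-- Sketch of the linked-source entry in hadoop-streaming/.classpath -->
<classpathentry kind="src" path="conf"
    including="capacity-scheduler.xml"
    excluding="**/*.java"/>
```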
Even after the fixes above, many errors remain. They are visible right in pom.xml, as shown below:
The same errors also show up in Maven's Lifecycle Mapping, at the location shown in the figure below:
The screenshot above was taken after I had fixed the errors, which is why everything is green.
In Eclipse, open Windows -> Preferences and find Maven -> Lifecycle Mappings, as shown below:
The path in the red box above does not actually contain a lifecycle-mapping-metadata.xml file; the file lives inside a jar in the Eclipse installation directory, at:
eclipse\plugins\org.eclipse.m2e.lifecyclemapping.defaults_xxxxxxxxxxxx.jar, as shown below:
Extract the file from that jar, place it at the path shown under Change mapping file location, and then add the missing plugin entries; the file itself demonstrates the format.
Here is the file I ended up with. Its contents are:
<?xml version="1.0" encoding="UTF-8"?> <lifecycleMappingMetadata> <lifecycleMappings> <lifecycleMapping> <packagingType>war</packagingType> <lifecycleMappingId>org.eclipse.m2e.jdt.JarLifecycleMapping</lifecycleMappingId> </lifecycleMapping> </lifecycleMappings> <pluginExecutions> <!-- standard maven plugins --> <pluginExecution> <pluginExecutionFilter> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-resources-plugin</artifactId> <goals> <goal>resources</goal> <goal>testResources</goal> <goal>copy-resources</goal> </goals> <versionRange>[2.4,)</versionRange> </pluginExecutionFilter> <action> <execute> <runOnIncremental>true</runOnIncremental> </execute> </action> </pluginExecution> <pluginExecution> <pluginExecutionFilter> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-resources-plugin</artifactId> <goals> <goal>resources</goal> <goal>testResources</goal> <goal>copy-resources</goal> </goals> <versionRange>[0.0.1,2.4)</versionRange> </pluginExecutionFilter> <action> <error> <message>maven-resources-plugin prior to 2.4 is not supported by m2e. 
Use maven-resources-plugin version 2.4 or later.</message> </error> </action> </pluginExecution> <pluginExecution> <pluginExecutionFilter> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-enforcer-plugin</artifactId> <goals> <goal>enforce</goal> </goals> <versionRange>[1.0-alpha-1,)</versionRange> </pluginExecutionFilter> <action> <ignore> <message>maven-enforcer-plugin (goal "enforce") is ignored by m2e.</message> </ignore> </action> </pluginExecution> <pluginExecution> <pluginExecutionFilter> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-invoker-plugin</artifactId> <goals> <goal>install</goal> </goals> <versionRange>[1.6-SONATYPE-r940877,)</versionRange> </pluginExecutionFilter> <action> <ignore> <message>maven-invoker-plugin (goal "install") is ignored by m2e.</message> </ignore> </action> </pluginExecution> <pluginExecution> <pluginExecutionFilter> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-remote-resources-plugin</artifactId> <versionRange>[1.0,)</versionRange> <goals> <goal>process</goal> </goals> </pluginExecutionFilter> <action> <ignore> <message>maven-remote-resources-plugin (goal "process") is ignored by m2e.</message> </ignore> </action> </pluginExecution> <pluginExecution> <pluginExecutionFilter> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-eclipse-plugin</artifactId> <versionRange>[0,)</versionRange> <goals> <goal>configure-workspace</goal> <goal>eclipse</goal> <goal>clean</goal> <goal>to-maven</goal> <goal>install-plugins</goal> <goal>make-artifacts</goal> <goal>myeclipse</goal> <goal>myeclipse-clean</goal> <goal>rad</goal> <goal>rad-clean</goal> </goals> </pluginExecutionFilter> <action> <error> <message>maven-eclipse-plugin is not compatible with m2e</message> </error> </action> </pluginExecution> <pluginExecution> <pluginExecutionFilter> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-source-plugin</artifactId> <versionRange>[2.0,)</versionRange> <goals> 
<goal>jar-no-fork</goal> <goal>test-jar-no-fork</goal> <!-- theoretically, the following goals should not be bound to lifecycle, but ignore them just in case --> <goal>jar</goal> <goal>aggregate</goal> <goal>test-jar</goal> </goals> </pluginExecutionFilter> <action> <ignore/> </action> </pluginExecution> <!--our add start******************************************************--> <pluginExecution> <pluginExecutionFilter> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-antrun-plugin</artifactId> <versionRange>[1.7,)</versionRange> <goals> <goal>run</goal> <goal>create-testdirs</goal> <goal>validate</goal> </goals> </pluginExecutionFilter> <action> <ignore /> </action> </pluginExecution> <pluginExecution> <pluginExecutionFilter> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-compiler-plugin</artifactId> <versionRange>[3.1,)</versionRange> <goals> <goal>testCompile</goal> <goal>compile</goal> </goals> </pluginExecutionFilter> <action> <ignore /> </action> </pluginExecution> <pluginExecution> <pluginExecutionFilter> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-jar-plugin</artifactId> <versionRange>[2.5,)</versionRange> <goals> <goal>test-compile</goal> <goal>test-jar</goal> </goals> </pluginExecutionFilter> <action> <ignore /> </action> </pluginExecution> <pluginExecution> <pluginExecutionFilter> <groupId>org.apache.hadoop</groupId> <artifactId>hadoop-maven-plugins</artifactId> <versionRange>[2.7.1,)</versionRange> <goals> <goal>protoc</goal> <goal>compile-protoc</goal> <goal>generate-sources</goal> <goal>compile-test-protoc</goal> <goal>generate-test-sources</goal> <goal>version-info</goal> </goals> </pluginExecutionFilter> <action> <ignore /> </action> </pluginExecution> <pluginExecution> <pluginExecutionFilter> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-plugin-plugin</artifactId> <versionRange>[3.4,)</versionRange> <goals> <goal>descriptor</goal> <goal>default-descriptor</goal> 
<goal>process-classes</goal> </goals> </pluginExecutionFilter> <action> <ignore /> </action> </pluginExecution> <pluginExecution> <pluginExecutionFilter> <groupId>org.apache.avro</groupId> <artifactId>avro-maven-plugin</artifactId> <versionRange>[1.7.4,)</versionRange> <goals> <goal>schema</goal> <goal>generate-avro-test-sources</goal> <goal>generate-test-sources</goal> <goal>protocol</goal> <goal>default</goal> <goal>generate-sources</goal> </goals> </pluginExecutionFilter> <action> <ignore /> </action> </pluginExecution> <pluginExecution> <pluginExecutionFilter> <groupId>org.codehaus.mojo</groupId> <artifactId>exec-maven-plugin</artifactId> <versionRange>[1.3.1,)</versionRange> <goals> <goal>exec</goal> <goal>compile-ms-native-dll</goal> <goal>compile-ms-winutils</goal> <goal>compile</goal> </goals> </pluginExecutionFilter> <action> <ignore /> </action> </pluginExecution> <pluginExecution> <pluginExecutionFilter> <groupId>org.codehaus.mojo</groupId> <artifactId>native-maven-plugin</artifactId> <versionRange>[1.0-alpha-8,)</versionRange> <goals> <goal>javah</goal> <goal>compile</goal> </goals> </pluginExecutionFilter> <action> <ignore /> </action> </pluginExecution> <!--our add end****************************************--> <!-- commonly used codehaus plugins --> <pluginExecution> <pluginExecutionFilter> <groupId>org.codehaus.mojo</groupId> <artifactId>animal-sniffer-maven-plugin</artifactId> <versionRange>[1.0,)</versionRange> <goals> <goal>check</goal> </goals> </pluginExecutionFilter> <action> <ignore /> </action> </pluginExecution> <pluginExecution> <pluginExecutionFilter> <groupId>org.codehaus.mojo</groupId> <artifactId>buildnumber-maven-plugin</artifactId> <versionRange>[1.0-beta-1,)</versionRange> <goals> <goal>create</goal> </goals> </pluginExecutionFilter> <action> <ignore /> </action> </pluginExecution> </pluginExecutions> </lifecycleMappingMetadata>
After the steps above, update every project (Maven -> Update Project), as shown in the figure below:
Once all of the above is done, every problem should be resolved.
That is my process for importing the source. These are essentially all the errors: besides the three typical ones, a few extra ones showed up as well.
Some errors also appeared when running the source; I will update this post with those later.