Environment requirements: OS: Windows 10; Eclipse version: Mars; Hadoop version: 2.6.0
Resource requirements: an unpacked Hadoop-2.6.0; download the original archive yourself: download link
A few warnings up front:
For all of the steps below, Eclipse must be started by right-clicking it and choosing "Run as administrator"!
When you create the MapReduce project, you also need to configure log4j (at DEBUG level); otherwise some debugging information will not be printed and it will be hard to find the cause of errors. Configuring log4j is simple; you can search online for the details, and a minimal example is sketched below.
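A minimal log4j.properties, dropped at the root of the project's src folder, could look like the following. This is a plain log4j 1.x setup (the version Hadoop 2.6.0 ships with) and is my suggestion rather than the exact file from the original post:

# Send everything at DEBUG and above to the Eclipse console
log4j.rootLogger=DEBUG, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} - %m%n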
1) First, build your own hadoop-eclipse-plugin with Ant. You can also search for and download a prebuilt one, but I prefer not to rely on other people's builds, so I compiled it myself. You can refer to my other post and learn to build it yourself: "Compiling Hadoop2.6.0-eclipse-plugin with Apache Ant".
2) Put the compiled Hadoop plugin into the plugins directory under your Eclipse installation, then restart Eclipse.
3) Open Window --> Preferences --> Hadoop Map/Reduce and set the Hadoop installation directory.
4) Open Window --> Show View, find Map/Reduce Locations under MapReduce Tools, and click OK.
5) The Map/Reduce Locations view then appears in Eclipse's main window.
6) Create a new Hadoop Location, set the host and port of the HDFS and YARN master nodes (the DFS Master port should match fs.defaultFS; the WordCount code below uses hdfs://master:8020), then click Finish.
7) You will now see the HDFS directory tree, DFS Locations, in Eclipse's Project Explorer.
Note: you may hit a permission error (Permission denied) when you expand this tree. This is because permission checking has been left at its default in HDFS's hdfs-site.xml (the default is true, which means permission checks are enforced and users outside the cluster cannot freely access the HDFS directory tree). Set it to false here, restart the HDFS service, and then refresh the DFS directory above:
<property>
  <name>dfs.permissions.enabled</name>
  <value>false</value>
</property>
8) Then we create a Map/Reduce Project and write a wordcount program. I uploaded Hadoop's README.txt to the /tmp/mrchor/ directory, renamed it readme, and used /tmp/mrchor/out as the output path (a small upload sketch follows the code below).
package com.mrchor.HadoopDev.hadoopDev;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountApp {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, WordCountApp.class.getSimpleName());
        job.setJarByClass(com.mrchor.HadoopDev.hadoopDev.WordCountApp.class);

        // TODO: specify a mapper
        job.setMapperClass(MyMapper.class);
        // TODO: specify a reducer
        job.setReducerClass(MyReducer.class);

        // TODO: specify output types
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);

        // TODO: specify input and output DIRECTORIES (not files)
        FileInputFormat.setInputPaths(job, new Path("hdfs://master:8020/tmp/mrchor/readme"));
        FileOutputFormat.setOutputPath(job, new Path("hdfs://master:8020/tmp/mrchor/out"));

        if (!job.waitForCompletion(true))
            return;
    }

    public static class MyMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
        Text k2 = new Text();
        LongWritable v2 = new LongWritable();

        @Override
        protected void map(LongWritable key, Text value,
                Mapper<LongWritable, Text, Text, LongWritable>.Context context)
                throws IOException, InterruptedException {
            // Split each input line on spaces and emit <word, 1> for every word.
            String[] split = value.toString().split(" ");
            for (String word : split) {
                k2.set(word);
                v2.set(1);
                context.write(k2, v2);
            }
        }
    }

    public static class MyReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
        @Override
        protected void reduce(Text k2, Iterable<LongWritable> v2s,
                Reducer<Text, LongWritable, Text, LongWritable>.Context context)
                throws IOException, InterruptedException {
            // Sum the counts for each word; the counter is a local variable
            // so that it is reset for every key instead of accumulating across keys.
            long sum = 0;
            for (LongWritable one : v2s) {
                sum += one.get();
            }
            context.write(k2, new LongWritable(sum));
        }
    }
}
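If the input file is not on HDFS yet, you can put it there with hdfs dfs -put, or with a small one-off class like the sketch below (the local path to README.txt and the class name are assumptions; adjust the path to wherever your unpacked Hadoop-2.6.0 lives):

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class UploadReadme {
    public static void main(String[] args) throws Exception {
        // Connect to the same NameNode the WordCount job uses.
        FileSystem fs = FileSystem.get(new URI("hdfs://master:8020"), new Configuration());
        // The local source path is hypothetical: point it at your own README.txt.
        fs.copyFromLocalFile(new Path("D:/hadoop-2.6.0/README.txt"),
                new Path("/tmp/mrchor/readme"));
        fs.close();
    }
}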
9) Right-click, then Run As --> Run on Hadoop:
A) Note: you may get the following error here:
java.io.IOException: HADOOP_HOME or hadoop.home.dir are not set.
This happens because the Hadoop environment variables are not configured on the machine where Eclipse is installed. Configure them as follows (or see the sketch after these steps for an alternative):
i) Right-click "My Computer" or "This PC" and choose Properties, then go to Advanced system settings --> Advanced --> Environment Variables --> System variables.
Create a new variable HADOOP_HOME and set it to the directory of the unpacked Hadoop-2.6.0.
ii) Restart Eclipse (run as administrator).
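As an alternative to the environment variable (a generally known workaround, and my own suggestion rather than part of the steps above), you can point Hadoop at the local installation from code by setting the hadoop.home.dir system property; the directory below is hypothetical:

// At the very start of WordCountApp.main(), before the Job is created.
// The path is hypothetical: point it at your unpacked Hadoop-2.6.0.
System.setProperty("hadoop.home.dir", "D:/hadoop-2.6.0");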
10) Run the wordcount program again with Run on Hadoop; you may now get the following error:
Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:557)
at org.apache.hadoop.fs.FileUtil.canRead(FileUtil.java:977)
at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:187)
at org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:174)
at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:108)
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.confChanged(LocalDirAllocator.java:285)
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:344)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:115)
at org.apache.hadoop.mapred.LocalDistributedCacheManager.setup(LocalDistributedCacheManager.java:131)
at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:163)
at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:731)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:536)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1296)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1293)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1293)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1314)
at com.mrchor.HadoopDev.hadoopDev.WordCountApp.main(WordCountApp.java:34)
Looking at the source code, there is an explanation in NativeIO.java: this is again an access-check problem, possibly meaning the current machine needs to be added to the users authorized on HDFS:
/**
 * Checks whether the current process has desired access rights on
 * the given path.
 *
 * Longer term this native function can be substituted with JDK7
 * function Files#isReadable, isWritable, isExecutable.
 *
 * @param path input path
 * @param desiredAccess ACCESS_READ, ACCESS_WRITE or ACCESS_EXECUTE
 * @return true if access is allowed
 * @throws IOException I/O exception on error
 */
However, there is a neater trick to get around this: copy this source file from the Hadoop sources into your MapReduce project, keeping its original package (org.apache.hadoop.io.nativeio). At run time the program resolves classes from your project before the external jars it references, so your copy is the one that gets loaded (a sketch of the edit usually made in that copy follows):
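A commonly used tweak inside the copied NativeIO.java (an assumption on my part, not something shown above) is to make the Windows access check succeed without calling the missing native method. The relevant fragment of the copied file would then look roughly like this:

// In the copied org/apache/hadoop/io/nativeio/NativeIO.java, nested class Windows:
public static boolean access(String path, AccessRight desiredAccess)
        throws IOException {
    // Bypass the native access0() check while debugging locally from Windows/Eclipse.
    return true;
    // original body: return access0(path, desiredAccess.accessRight());
}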
11) Run the wordcount program again; this time it should execute, and the result is:
If you get the result above, the program ran correctly, and what is printed is the output of the MapReduce job. Refreshing the directory tree, we can see two files under /tmp/mrchor/out: _SUCCESS and part-r-00000:
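To double-check the result outside the DFS Locations view, a small sketch like the following streams part-r-00000 to the console (the NameNode address and paths match the job above; the class name is just for illustration):

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class PrintWordCountResult {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new URI("hdfs://master:8020"), new Configuration());
        // Stream the reducer output straight to the console.
        try (FSDataInputStream in = fs.open(new Path("/tmp/mrchor/out/part-r-00000"))) {
            IOUtils.copyBytes(in, System.out, 4096, false);
        }
        fs.close();
    }
}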
That confirms the program produced the correct result. At this point, remote debugging of Hadoop from Eclipse is officially a success! Everyone give a round of applause O(∩_∩)O