Notice: This is the author's original work; please credit the source when reposting. Author: 帥氣陳吃蘋果
Download and unzip it; download link: https://pan.baidu.com/s/1i51UsVN
Download and unzip it; download link: https://pan.baidu.com/s/1i57ZXqt
Configure the environment variables:
Create a new system variable HADOOP_HOME with the value E:\Hadoop\hadoop-2.6.5
Append Hadoop's /bin path to the Path system variable, i.e. E:\Hadoop\hadoop-2.6.5\bin
Make sure the cluster is running, that the local Windows machine and the cluster's master can ping each other, and that an SSH connection can be established;
In C:\Windows\System32\drivers\etc\hosts, append the IP-to-hostname mapping for the Hadoop cluster's master node, as follows:
192.168.29.188 vnet
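To verify the mapping from the Windows side, a quick Java check can resolve the hostname and ping the master. This is a minimal sketch; the class name and the 3-second timeout are arbitrary choices, not from the original article:

```java
import java.net.InetAddress;

public class HostCheck {
    public static void main(String[] args) throws Exception {
        // Resolve the master's hostname via the hosts-file mapping added above.
        InetAddress master = InetAddress.getByName("vnet");
        System.out.println("Resolved " + master.getHostName() + " -> " + master.getHostAddress());
        // isReachable is a best-effort ping (ICMP or TCP echo, depending on privileges).
        System.out.println("Reachable: " + master.isReachable(3000));
    }
}
```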
Download link: https://pan.baidu.com/s/1o7791VG
After downloading, put the plugin into the plugins directory under the Eclipse installation directory and restart Eclipse.
1) After restarting Eclipse, this view appears in the left-hand pane:
Open Window ---> Perspective ---> Open Perspective ---> Other... and select Map/Reduce. If the option is missing even though the plugin is in the plugins directory and Eclipse has been restarted, the Eclipse or plugin version is likely mismatched; download a matching version.
<img width="300" src="https://i.imgur.com/Twag1wi.p...; />.net
2) Open Window ---> Preferences ---> Hadoop Map/Reduce and configure the Hadoop installation directory.
<img width="600" src="https://i.imgur.com/1jCAkYr.p...; />
Select the Map/Reduce Locations view in the bottom pane of Eclipse, right-click, and choose New Hadoop Location, as shown below:
<img width="700" src="https://i.imgur.com/NPaZQXL.p...; />
The detailed configuration is as follows (the DFS Master host and port must match the cluster's fs.defaultFS, here vnet:9000):
<img width="600" src="https://i.imgur.com/vDAsRBj.p...; />
Click Finish. If no error is reported, the connection succeeded, and the directory structure and file contents of HDFS appear under DFS Locations on the left side of Eclipse;
If you run into `An internal error occurred during: "Map/Reduce location status updater". java.lang.NullPointerException`,
it means the HDFS file system is currently empty; just create a file on HDFS and refresh DFS Locations to see the file system's contents;
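One way to create such a file without going back to the shell is the HDFS Java API. A minimal sketch; the class name and file path are made up for illustration, while hdfs://vnet:9000 and user root match the cluster used later in this article:

```java
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsTouch {
    public static void main(String[] args) throws Exception {
        // Connect to the NameNode, acting as HDFS user "root".
        FileSystem fs = FileSystem.get(new URI("hdfs://vnet:9000"), new Configuration(), "root");
        Path p = new Path("/user/root/hello.txt");
        // Create a small file so DFS Locations has something to display.
        try (FSDataOutputStream out = fs.create(p)) {
            out.writeBytes("hello hdfs\n");
        }
        System.out.println("created: " + fs.exists(p));
        fs.close();
    }
}
```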
Create the input file on the master node and upload it to the corresponding HDFS input directory, as follows (the `mkdir -p` line is added here in case the input directory does not exist yet):

```
vi input.txt                               # enter the word-count input content, then save
hdfs dfs -mkdir -p /user/root/input        # create the HDFS input directory if it does not exist
hdfs dfs -put input.txt /user/root/input/  # upload the file from the local Linux file system to HDFS
```
input.txt:

```
hello world hello hadoop bye bye hadoop
```
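Since the job is deterministic, the result for this input is known in advance; after a successful run, part-r-00000 in the output directory should read (keys sorted, key and count separated by a tab, as TextOutputFormat writes them):

```
bye	2
hadoop	2
hello	2
world	1
```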
File ---> New ---> Project ---> Map/Reduce Project. Enter a project name; you also need to choose the Hadoop library path, and selecting "Use default Hadoop" is fine here, i.e. the Hadoop we configured in Eclipse earlier.
WordCount.java:

```java
package com.wecon.sqchen;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class WordCount {

    public static class WordCountMap extends Mapper<LongWritable, Text, Text, IntWritable> {

        private final IntWritable one = new IntWritable(1);
        private Text word = new Text();

        // Emit (word, 1) for every token in the input line.
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            StringTokenizer token = new StringTokenizer(line);
            while (token.hasMoreTokens()) {
                word.set(token.nextToken());
                context.write(word, one);
            }
        }
    }

    public static class WordCountReduce extends Reducer<Text, IntWritable, Text, IntWritable> {

        // Sum all counts for a word and emit (word, total).
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        // Point at the local Hadoop installation so winutils.exe can be found on Windows.
        System.setProperty("hadoop.home.dir", "E:/Hadoop/hadoop-2.6.5");

        Configuration conf = new Configuration();
        Job job = new Job(conf);
        job.setJarByClass(WordCount.class);
        job.setJobName("wordcount");

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        job.setMapperClass(WordCountMap.class);
        job.setReducerClass(WordCountReduce.class);

        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);

        // args[0] = HDFS input directory, args[1] = HDFS output directory
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.waitForCompletion(true);
    }
}
```
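One note on the code: `new Job(conf)` still compiles but is deprecated in Hadoop 2.x. The recommended equivalent, which also sets the job name in the same call, is:

```java
// Preferred in Hadoop 2.x instead of the deprecated Job(Configuration) constructor.
Job job = Job.getInstance(conf, "wordcount");
```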
Right-click and open Run As ---> Run Configurations, then fill in Arguments, i.e. the input and output directories the program expects, as follows:
<img width="600" src="https://i.imgur.com/pFqvNr2.p...; />
Once configured, Run As ---> Java Application. If no error is reported, the program ran successfully; after refreshing DFS Locations on the left side of Eclipse, the output directory and output file are visible, as shown below:
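To inspect the result without leaving the IDE, a small reader sketch can print the output file directly. Assumptions not in the original article: the class name, that fs.defaultFS is hdfs://vnet:9000 as above, that the job ran as user root, and that a single reducer produced part-r-00000:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CatOutput {
    public static void main(String[] args) throws Exception {
        // Connect to the NameNode used throughout this article, acting as user root.
        FileSystem fs = FileSystem.get(new URI("hdfs://vnet:9000"), new Configuration(), "root");
        // A single reducer writes its results to part-r-00000 under the output directory.
        Path part = new Path("/user/root/output/part-r-00000");
        try (BufferedReader in = new BufferedReader(new InputStreamReader(fs.open(part)))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
        fs.close();
    }
}
```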
1) java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
Solution:
In the main method, before the job is submitted, specify the local Hadoop installation path by adding the following line:

```java
System.setProperty("hadoop.home.dir", "E:/Hadoop/hadoop-2.6.5");
```
2) `(null) entry in command string: null chmod 0700 E:\tmp\hadoop-Administrator\mapred\staging\Administr`
Solution:
Reference link: https://ask.hellobi.com/blog/...
The file needed in that link can be downloaded here: https://pan.baidu.com/s/1i4Z4aVV
3) org.apache.hadoop.security.AccessControlException: Permission denied: user=Administrator, access=WRITE, inode="/user/root":root:supergroup:drwxr-xr-x
Solution:
This is an HDFS permission problem that occurs when the application is run as the local Windows user;
Reference link: http://blog.csdn.net/Camu7s/a...
Using the third method from the link, run the following commands on the master node:
```
adduser Administrator
groupadd supergroup
usermod -a -G supergroup Administrator
```
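A client-side alternative sometimes used instead (my suggestion, not from the reference link): tell the HDFS client to act as the root user before any FileSystem is created. Hadoop's UserGroupInformation checks the HADOOP_USER_NAME environment variable and, failing that, the system property of the same name:

```java
// Make the Eclipse client identify itself to HDFS as "root" instead of
// the local Windows user "Administrator". Must run before any FileSystem
// or Job object is created.
System.setProperty("HADOOP_USER_NAME", "root");
```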
4) org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://vnet:9000/user/root/output already exists
Solution:
This happens because the project's output directory already exists on HDFS. The output directory is created while the program runs and must not exist beforehand, so simply delete the corresponding output directory on HDFS.
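To avoid deleting it by hand on every run, a small guard can go in main before the job is submitted. A sketch, not part of the original code: it reuses the conf and args from WordCount above, and needs an extra import of org.apache.hadoop.fs.FileSystem:

```java
// Remove a leftover output directory from a previous run;
// MapReduce refuses to start if the output directory already exists.
FileSystem fs = FileSystem.get(conf);
Path outputPath = new Path(args[1]);
if (fs.exists(outputPath)) {
    fs.delete(outputPath, true); // true = recursive delete
}
```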
5)
```
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
```
Solution:
In the project's src directory, New ---> Other ---> General ---> File, create a file named "log4j.properties" with the following content:
```
log4j.rootLogger=WARN, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d %p [%c] - %m%n
```