Setting up a Hadoop cluster: VirtualBox + Ubuntu 14.04 + Hadoop 2.6.0
With the cluster up, install Eclipse on the Mac and connect it to the Hadoop cluster.
Add the Master's IP to the Mac's /etc/hosts file:
```
##
# Host Database
#
# localhost is used to configure the loopback interface
# when the system is booting.  Do not change this entry.
##
127.0.0.1       localhost
255.255.255.255 broadcasthost
::1             localhost
192.168.56.101  Master        # add the Master's IP
```
On the Master, start the cluster.
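A minimal sketch of the start-up commands, assuming $HADOOP_HOME points at the Hadoop 2.6.0 installation and the NameNode has already been formatted:

```bash
# Start HDFS: NameNode/SecondaryNameNode on the Master, DataNodes on the slaves
$HADOOP_HOME/sbin/start-dfs.sh
# Start YARN: ResourceManager on the Master, NodeManagers on the slaves
$HADOOP_HOME/sbin/start-yarn.sh
# jps should now list NameNode, SecondaryNameNode and ResourceManager on the Master
jps
```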
On the Mac, open http://master:50070/
If the page loads and shows the cluster information, you are good to go.
Download the Eclipse IDE for Java Developers:
http://www.eclipse.org/downloads/package...
You can download hadoop2x-eclipse-plugin from GitHub (mirror download: http://pan.baidu.com/s/1i4ikIoP).
In Applications, find Eclipse, right-click it, and choose Show Package Contents.
Copy the plugin into the plugins directory, then restart Eclipse.
Unpack the Hadoop distribution to any directory; no configuration is needed. Then just point Eclipse at that directory.
Click the plus icon in the top-right corner to add the Map/Reduce perspective.
Select the Map/Reduce Locations tab, right-click, and choose New Hadoop location.
You need to change the Location name, Host, the Port under DFS Master, and the User name (the Host "Master" resolves through the IP configured in the Mac's hosts file; the DFS Master port should match the NameNode port set in core-site.xml, typically 9000). When done, click Finish.
Check whether HDFS can now be accessed directly.
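To double-check connectivity outside the plugin, here is a minimal smoke test using the HDFS Java API. The NameNode address hdfs://Master:9000, the user name hadoop, and the class name HdfsSmokeTest are assumptions; adjust them to your cluster:

```java
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsSmokeTest {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Connect to the NameNode as the cluster user; both values are assumptions
        FileSystem fs = FileSystem.get(new URI("hdfs://Master:9000"), conf, "hadoop");
        // Listing the HDFS root proves the connection works
        for (FileStatus status : fs.listStatus(new Path("/"))) {
            System.out.println(status.getPath());
        }
        fs.close();
    }
}
```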
File -> New -> Other -> Map/Reduce Project
Enter the project name WordCount, then click Finish.
Create a class with the package name org.apache.hadoop.examples and the class name WordCount.
Copy the following code into WordCount.java:
```java
package org.apache.hadoop.examples;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

  public static class TokenizerMapper
       extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  public static class IntSumReducer
       extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length != 2) {
      System.err.println("Usage: wordcount <in> <out>");
      System.exit(2);
    }
    Job job = new Job(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```
Copy all of the modified configuration files, plus log4j.properties, into the src directory.
Here I copied slaves, core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml.
Hover over WordCount.java, right-click, and choose Run As -> Java Application.
The program will not run properly this time: without arguments it prints the usage message and exits. Right-click again, choose Run As -> Run Configurations.
On the Arguments tab, fill in the input and output paths, separated by a space.
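For example, the program arguments might look like this (both HDFS paths are placeholders; the input directory must already exist and contain some text files):

```
hdfs://Master:9000/input hdfs://Master:9000/output
```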
After configuring, click Run. This time a Permission denied error appears: the Mac user has no permission to access HDFS. Fix it on the Master:
```bash
# Suppose the Mac username is hadoop
groupadd supergroup            # create the supergroup group
useradd -g supergroup hadoop   # add a hadoop user whose primary group is supergroup
# Loosen the permissions on the cluster's HDFS files so that all users
# in the supergroup group have read/write access
hadoop fs -chmod 777 /
```
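Alternatively, rather than opening up permissions on HDFS, the client can act as the cluster's own user. A minimal sketch, assuming the HDFS superuser is named hadoop; put this at the top of main() before the Job is constructed:

```java
// Make the HDFS client act as the cluster user "hadoop" (an assumption;
// use the user that owns the HDFS directories on your cluster).
System.setProperty("HADOOP_USER_NAME", "hadoop");
```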
To browse the Hadoop source in Eclipse, download the source package:
http://apache.claz.org/hadoop/common/had...
In the search box at the top right, search for Open Type.
Type NameNode and select the NameNode class; you will find that the source cannot be viewed.
Click Attach Source -> External location -> External Folder, and point it at the unpacked Hadoop source directory.