Submitting Jobs to a Hadoop Cluster from Eclipse on a Mac

Cluster setup: VirtualBox + Ubuntu 14.04 + Hadoop 2.6.0

With the cluster up, install Eclipse on the Mac and connect it to the Hadoop cluster.

1. Accessing the Cluster

1.1 Edit the Mac's hosts file

Add the Master's IP to the Mac's /etc/hosts:

##
# Host Database
#
# localhost is used to configure the loopback interface
# when the system is booting.  Do not change this entry.
##
127.0.0.1    localhost
255.255.255.255    broadcasthost
::1             localhost

192.168.56.101  Master # add the Master's IP
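
To confirm that the new entry resolves (a quick check, not part of the original steps):

ping -c 1 master    # should answer from 192.168.56.101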

1.2 Access the cluster

On Master, start the cluster.
On the Mac, open http://master:50070/.
If the page loads and shows the cluster's information, access is working.
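
A minimal start-up sketch for the Master side, assuming HADOOP_HOME points at the Hadoop 2.6.0 installation (an assumption; adjust to your layout):

$HADOOP_HOME/sbin/start-dfs.sh     # start the NameNode and DataNodes
$HADOOP_HOME/sbin/start-yarn.sh    # start the ResourceManager and NodeManagers
jps                                # verify the daemons are running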

2. Download and Install Eclipse

Eclipse IDE for Java Developers

http://www.eclipse.org/downloads/package...

3. Configure Eclipse

3.1 Set up the Hadoop-Eclipse-Plugin

3.1.1 Download the Hadoop-Eclipse-Plugin

Download hadoop2x-eclipse-plugin from GitHub (mirror: http://pan.baidu.com/s/1i4ikIoP).

3.1.2 Install the Hadoop-Eclipse-Plugin

In Applications, find Eclipse, right-click it, and choose Show Package Contents.

Copy the plugin jar into the plugins directory, then restart Eclipse; see the sketch below.
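
A sketch of the copy step; the jar name and the Eclipse install path are assumptions, adjust them to your download and Eclipse version:

cp hadoop-eclipse-plugin-2.6.0.jar /Applications/Eclipse.app/Contents/Eclipse/plugins/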

3.2 Connect to the Hadoop Cluster

3.2.1 Configure the Hadoop installation directory

Extract the Hadoop distribution to any directory (no configuration is needed), then point Eclipse at that directory.
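
For example (the tarball name and target directory are assumptions):

tar -zxf hadoop-2.6.0.tar.gz -C ~/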

3.2.2 Configure the cluster address

Click the plus icon in the top-right corner.

Add the Map/Reduce perspective.

Select the Map/Reduce Locations tab, right-click, and choose New Hadoop location.

Set the Location name, Host, the Port under DFS Master, and the User name (the Master hostname resolves through the IP configured in the Mac's hosts file). When done, click Finish.
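
The Port under DFS Master must match fs.defaultFS in the cluster's core-site.xml. One way to check it, run on Master (assuming the hdfs command is on the PATH):

hdfs getconf -confKey fs.defaultFS    # e.g. hdfs://Master:9000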

3.2.3 View HDFS

Check whether HDFS can now be browsed directly from the DFS Locations view.
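
The same check can be done from the Mac's command line, using the Hadoop directory configured in 3.2.1 (the path and port below are assumptions):

~/hadoop-2.6.0/bin/hdfs dfs -ls hdfs://Master:9000/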

4. Running WordCount on the Cluster

4.1 Create the project

File -> New -> Other -> Map/Reduce Project

Enter the project name WordCount, then click Finish.

4.2 Create the class

Create a class with the package name org.apache.hadoop.examples and the class name WordCount.

4.3 WordCount code

Copy the following code into WordCount.java:

package org.apache.hadoop.examples;
 
import java.io.IOException;
import java.util.StringTokenizer;
 
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;
 
public class WordCount {
 
  // Mapper: splits each input line into tokens and emits (word, 1) for each token
  public static class TokenizerMapper
       extends Mapper<Object, Text, Text, IntWritable>{
 
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();
 
    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }
 
  // Reducer (also used as the combiner): sums the counts emitted for each word
  public static class IntSumReducer
       extends Reducer<Text,IntWritable,Text,IntWritable> {
    private IntWritable result = new IntWritable();
 
    public void reduce(Text key, Iterable<IntWritable> values, 
                       Context context
                       ) throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }
 
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length != 2) {
      System.err.println("Usage: wordcount <in> <out>");
      System.exit(2);
    }
    // Note: new Job(conf, ...) is deprecated in Hadoop 2.x; Job.getInstance(conf, "word count") is preferred
    Job job = new Job(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

4.4 Configure Hadoop parameters

Copy all the configuration files you modified, plus log4j.properties, into the src directory.

Here I copied slaves, core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml, as sketched below.
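
A sketch of the copy, assuming HADOOP_HOME points at the installation and the Eclipse project lives at ~/workspace/WordCount (both assumptions):

cp $HADOOP_HOME/etc/hadoop/{slaves,core-site.xml,hdfs-site.xml,mapred-site.xml,yarn-site.xml,log4j.properties} ~/workspace/WordCount/src/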

4.5 Configure HDFS input/output paths

Right-click WordCount.java, then Run As -> Java Application.

The program will not run correctly at this point. Right-click again, choose Run As -> Run Configurations.

Fill in the input and output paths, separated by a space; see the sketch below.
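
A sketch, with all paths and the port as assumptions to be matched to your cluster: upload some input on Master first, then enter the two program arguments in the Run Configuration.

# on Master: create an input directory and upload a few files
$HADOOP_HOME/bin/hdfs dfs -mkdir -p /user/hadoop/input
$HADOOP_HOME/bin/hdfs dfs -put $HADOOP_HOME/etc/hadoop/*.xml /user/hadoop/input

# program arguments (separated by one space):
#   hdfs://Master:9000/user/hadoop/input hdfs://Master:9000/user/hadoop/output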

After configuring, click Run. At this point a Permission denied error appears.

5. Problems Encountered While Running

5.1 Permission denied

The Mac user has no permission to access HDFS. Fix it on the cluster:

# Suppose the Mac username is hadoop; run these on Master as root
groupadd supergroup            # create the supergroup group
useradd -g supergroup hadoop   # add a hadoop user belonging to supergroup

# Open up the permissions on the HDFS root so the job can read and write
# (note: 777 grants access to every user, not just supergroup; acceptable on a test cluster)
hadoop fs -chmod 777 /
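
An alternative sketch that avoids changing HDFS permissions (not from the original; it works with Hadoop's default simple authentication): have the job submit as the cluster-side user by setting HADOOP_USER_NAME in the client JVM.

// Add at the very top of WordCount.main, before the Configuration is created;
// "hadoop" is the assumed cluster-side username (an assumption, use your own)
System.setProperty("HADOOP_USER_NAME", "hadoop");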

6. Browsing the Hadoop Source

6.1 Download the source

http://apache.claz.org/hadoop/common/had...

6.2 Attach the source

In the search box at the top right, search for Open Type.

Type NameNode and select it; the source cannot be viewed yet.

Click Attach Source -> External location -> External Folder, and point it at the directory where the source was extracted.

圖片描述

References

Compiling and Running MapReduce Programs with Eclipse (Hadoop 2.6.0 on Ubuntu/CentOS)
