Submitting Hadoop Jobs to a Remote Cluster from IDEA

  • Create a new Maven project and add the following dependencies:
<dependencies>
     <dependency>
         <groupId>org.apache.hadoop</groupId>
         <artifactId>hadoop-common</artifactId>
         <version>2.7.1</version>
     </dependency>

     <dependency>
         <groupId>org.apache.hadoop</groupId>
         <artifactId>hadoop-mapreduce-client-core</artifactId>
         <version>2.7.1</version>
     </dependency>

     <dependency>
         <groupId>org.apache.hadoop</groupId>
         <artifactId>hadoop-hdfs</artifactId>
         <version>2.7.1</version>
     </dependency>

     <dependency>
         <groupId>org.apache.hadoop</groupId>
         <artifactId>hadoop-mapreduce-client-jobclient</artifactId>
         <version>2.7.1</version>
     </dependency>
 </dependencies>
  • Write the Map class
public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
      @Override
      protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
          String line = value.toString();
          System.out.println("line: " + line);
          // TextInputFormat already delivers one line per map() call, so the "\n" tokenizer is purely defensive
          StringTokenizer tokenizer = new StringTokenizer(line, "\n");
          while (tokenizer.hasMoreTokens()) {
              // each record has the form "<name> <score>", separated by whitespace
              StringTokenizer tokenizerLine = new StringTokenizer(tokenizer.nextToken());
              String strName = tokenizerLine.nextToken();
              String strScore = tokenizerLine.nextToken();
              Text name = new Text(strName);
              int score = Integer.parseInt(strScore);
              context.write(name, new IntWritable(score));
          }
      }
  }
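As a quick sanity check of the parsing logic above, the following standalone snippet (not part of the job; the class name is illustrative and the sample line is taken from the input shown further below) shows the (key, value) pair a single map() call emits:

import java.util.StringTokenizer;

public class ParseCheck {
    public static void main(String[] args) {
        String line = "陳洲立 67";                                   // one line of the sample input
        StringTokenizer tokenizerLine = new StringTokenizer(line);
        String strName = tokenizerLine.nextToken();                  // "陳洲立"
        int score = Integer.parseInt(tokenizerLine.nextToken());     // 67
        System.out.println(strName + " -> " + score);                // the pair written to the context
    }
}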
  • Write the Reduce class
public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
      @Override
      protected void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
          int sum = 0;
          int count = 0;
          Iterator<IntWritable> iterator = values.iterator();
          while (iterator.hasNext()) {
              sum += iterator.next().get();
              count++;
          }
          // integer division: the average is truncated toward zero
          int average = sum / count;
          context.write(key, new IntWritable(average));
      }
  }
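One caveat before the driver: this averaging Reducer should not also be registered as a combiner, because averages are not associative. A small standalone illustration (the way the scores are split across map tasks is hypothetical):

public class CombinerPitfall {
    public static void main(String[] args) {
        // one student's scores split across two map tasks: {67, 68} and {98}
        int combiner1 = (67 + 68) / 2;                   // combiner on mapper 1 -> 67
        int combiner2 = 98;                              // combiner on mapper 2 -> 98
        int withCombiner = (combiner1 + combiner2) / 2;  // the reducer then sees 67 and 98 -> 82
        int withoutCombiner = (67 + 68 + 98) / 3;        // true average -> 77
        System.out.println(withCombiner + " vs " + withoutCombiner);
    }
}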
  • The main function
System.setProperty("HADOOP_USER_NAME", "wujinlei");
Configuration conf = new Configuration();
conf.set("fs.defaultFS", "hdfs://master:9000");
// required when submitting from Windows to a Linux cluster
conf.set("mapreduce.app-submission.cross-platform", "true");
// path to the packaged job jar on the local machine
conf.set("mapred.jar", "E:\\JackManWu\\hadoo-ptest\\target\\hadoop-test-1.0-SNAPSHOT.jar");
conf.set("fs.hdfs.impl", org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());
Job job = Job.getInstance(conf, "student_score");
job.setJarByClass(StudentScore.class); // the driver class packaged in the jar to run

job.setMapperClass(Map.class);
// Reusing Reduce as a combiner is unsafe here because averaging is not
// associative (see the note after the Reduce class), so it is left out.
// job.setCombinerClass(Reduce.class);
job.setReducerClass(Reduce.class);

job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);

FileInputFormat.addInputPath(job, new Path("hdfs://master:9000/home/wujinlei/work/student/input"));
FileOutputFormat.setOutputPath(job, new Path("hdfs://master:9000/home/wujinlei/work/student/output"));
System.exit(job.waitForCompletion(true) ? 0 : 1);
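For completeness, the main() body above is assumed to live in a driver class along these lines; the class layout is a sketch, but the imports match the Hadoop 2.7.1 classes used in the snippets:

import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class StudentScore {
    // the Map and Reduce classes shown above go here

    public static void main(String[] args) throws Exception {
        // the driver code shown above goes here
    }
}

Because mapred.jar points at the packaged jar on the local disk, rebuild the project (for example with mvn clean package) so the jar exists before each submission.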
  • Prepare the input files under home/wujinlei/work/student/input on the cluster in advance, following the input-file setup described in the first Hadoop program (WordCount); a sketch for staging the input from code follows the sample data below. (Note: do not create home/wujinlei/work/student/output in advance; the framework generates the output directory itself, and the job fails if it already exists.)
    • Sample input file:
    陳洲立 67
    陳東偉 98
    李寧 87
    楊森 86
    劉東奇 78
    譚果 94
    蓋蓋 83
    陳洲立 68
    陳東偉 96
    李寧 82
    楊森 85
    劉東奇 72
    譚果 97
    蓋蓋 82
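If you prefer to stage the input from code rather than with hdfs dfs -put, a minimal sketch using org.apache.hadoop.fs.FileSystem could look like this (the local file path is an assumption):

// Copy a local sample file into the job's HDFS input directory.
System.setProperty("HADOOP_USER_NAME", "wujinlei");
Configuration conf = new Configuration();
conf.set("fs.defaultFS", "hdfs://master:9000");
FileSystem fs = FileSystem.get(conf);
fs.copyFromLocalFile(new Path("E:\\JackManWu\\student_scores.txt"),   // hypothetical local file
        new Path("/home/wujinlei/work/student/input/student_scores.txt"));
fs.close();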
  • Run the main function, follow the Hadoop logs and the job tracking page to monitor execution, and verify the generated result.
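To check the result without shelling into the cluster, something along these lines should print the final name/average pairs (a single reducer is assumed, hence part-r-00000; hdfs dfs -cat works just as well). It uses java.io.BufferedReader/InputStreamReader and the same FileSystem API as above:

// Read the reducer output: one "name<TAB>average" pair per line.
Configuration conf = new Configuration();
conf.set("fs.defaultFS", "hdfs://master:9000");
FileSystem fs = FileSystem.get(conf);
Path result = new Path("/home/wujinlei/work/student/output/part-r-00000");
try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(fs.open(result), StandardCharsets.UTF_8))) {
    String line;
    while ((line = reader.readLine()) != null) {
        System.out.println(line);
    }
}
fs.close();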