Hadoop setup and Eclipse development environment configuration (repost)

1.    Configuring the Eclipse development environment on Windows

1.1 Installing the Hadoop development plugin

Copy hadoop\contrib\eclipse-plugin\hadoop-0.20.2-eclipse-plugin.jar from the Hadoop installation package into the Eclipse plugins directory.

Note that the plugin version (and the versions of all jar packages imported later for development) must match the running Hadoop version; otherwise an EOFException may occur.

 

Restart Eclipse, then open Window -> Open Perspective -> Other -> Map/Reduce to switch to the Map/Reduce development perspective.

 

1.2 Setting the connection parameters

Open the Window -> Show View -> Other -> Map/Reduce Locations view, click the elephant icon, and add the parameters in the dialog that pops up (General tab):

 

 

The parameters are as follows:

Location name: arbitrary

Map/Reduce Master: must match the mapred.job.tracker setting in mapred-site.xml.

DFS Master: must match the fs.default.name setting in core-site.xml.

User name: the user that runs the Hadoop service on the server.

 

Then open the "Advanced parameters" panel and modify the relevant parameters. Once the parameters above are filled in, they are also reflected in the corresponding parameters here:

The main parameters to pay attention to are the following (a sketch of the matching configuration files appears after the list):

fs.default.name: must match the fs.default.name setting in core-site.xml.

mapred.job.tracker: must match the mapred.job.tracker setting in mapred-site.xml.

dfs.replication: must match the dfs.replication setting in hdfs-site.xml.

hadoop.tmp.dir: must match the hadoop.tmp.dir setting in core-site.xml.

hadoop.job.ugi: this is not a username and password; it is the user and the group name, so enter hadoop,hadoop here.
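For reference, a minimal sketch of what the matching server-side configuration files might look like; the hostname master and the ports 9000 and 9001 are assumptions for illustration, not values taken from this article:

core-site.xml:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/tmp</value>
  </property>
</configuration>

mapred-site.xml:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>master:9001</value>
  </property>
</configuration>

With those values, DFS Master in Eclipse would be master:9000 and Map/Reduce Master would be master:9001.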

Note: the first time you set this up, the hadoop.job.ugi and dfs.replication parameters may be missing; that is fine, just confirm and save. Open the DFS Locations node in the Project Explorer and you should now see the structure of the file system. However, /hadoop/mapred/system cannot be viewed because of missing permissions.

 

 

Deleting files also fails with an error.

 

The cause is that I was operating on the remote Hadoop system as the local user Administrator (I had logged in to Windows as the administrator), which does not have permission.

 

Now reopen the "Advanced parameters" panel and you should see hadoop.job.ugi. This parameter defaults to the local operating system user name; if that unfortunately differs from the remote Hadoop user, it has to be changed: put hadoop first, separated by a comma, for example hadoop,Administrator.

After saving the configuration, restart Eclipse. /hadoop/mapred/system is now fully visible, and deleting files works as well.

 

1.3 Running a Hadoop program

First, import all the jar packages under the Hadoop installation directory into the Eclipse project.

Then create a class, DFSOperator.java, with four basic methods: create a file, delete a file, read a file's content into a string, and write a string into a file. It also has a main function that can be modified for testing:

 

package com.kingdee.hadoop;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Utilities to operate on files on Hadoop HDFS.
 *
 * @author luolihui 2011-07-18
 */
public class DFSOperator {

    private static final String ROOT_PATH = "hdfs:///";
    private static final int BUFFER_SIZE = 4096;

    /**
     * Constructor.
     */
    public DFSOperator() {}

    /**
     * Create a file on HDFS. The root path is /.<br>
     * For example: DFSOperator.createFile("/lory/test1.txt", true);
     * @param path the file name to open
     * @param overwrite if a file with this name already exists, then if true the file will be overwritten, and if false an error will be thrown
     * @return true if creation is successful; otherwise an IOException is thrown
     * @throws IOException
     */
    public static boolean createFile(String path, boolean overwrite) throws IOException {
        // To address a remote cluster explicitly, use:
        // String uri = "hdfs://192.168.1.100:9000";
        // FileSystem fs = FileSystem.get(URI.create(uri), conf);
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path f = new Path(ROOT_PATH + path);
        FSDataOutputStream os = fs.create(f, overwrite);
        os.close(); // close the output stream so the file is actually finalized
        fs.close();
        return true;
    }

    /**
     * Delete a file on HDFS. The root path is /.<br>
     * For example: DFSOperator.deleteFile("/user/hadoop/output", true);
     * @param path the path to delete
     * @param recursive if the path is a directory and set to true, the directory is deleted; otherwise an exception is thrown. For a file, recursive may be either true or false.
     * @return true if deletion is successful; otherwise an IOException is thrown
     * @throws IOException
     */
    public static boolean deleteFile(String path, boolean recursive) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path f = new Path(ROOT_PATH + path);
        fs.delete(f, recursive);
        fs.close();
        return true;
    }

    /**
     * Read a file on HDFS into a string.<br>
     * For example: System.out.println(DFSOperator.readDFSFileToString("/user/hadoop/input/test3.txt"));
     * @param path the path to read
     * @return the file content as a string, or null if the file does not exist
     * @throws IOException
     */
    public static String readDFSFileToString(String path) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path f = new Path(ROOT_PATH + path);
        InputStream in = null;
        String str = null;
        StringBuilder sb = new StringBuilder(BUFFER_SIZE);
        if (fs.exists(f)) {
            in = fs.open(f);
            BufferedReader bf = new BufferedReader(new InputStreamReader(in));
            while ((str = bf.readLine()) != null) {
                sb.append(str);
                sb.append("\n");
            }
            bf.close(); // also closes the underlying input stream
            fs.close();
            return sb.toString();
        } else {
            return null;
        }
    }

    /**
     * Write a string to a file on HDFS.<br>
     * For example: DFSOperator.writeStringToDFSFile("/lory/test1.txt", "You are a bad man.\nReally!\n");
     * @param path the file to write the string into
     * @param string the content to write into the file
     * @return true if the write is successful; otherwise an IOException is thrown
     * @throws IOException
     */
    public static boolean writeStringToDFSFile(String path, String string) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        FSDataOutputStream os = null;
        Path f = new Path(ROOT_PATH + path);
        os = fs.create(f, true);
        os.writeBytes(string);
        os.close();
        fs.close();
        return true;
    }

    public static void main(String[] args) {
        try {
            DFSOperator.createFile("/lory/test1.txt", true);
            DFSOperator.deleteFile("/dfs_operator.txt", true);
            DFSOperator.writeStringToDFSFile("/lory/test1.txt", "You are a bad man.\nReally?\n");
            System.out.println(DFSOperator.readDFSFileToString("/lory/test1.txt"));
        } catch (IOException e) {
            e.printStackTrace();
        }
        System.out.println("===end===");
    }
}

 

Then Run As -> Run on Hadoop -> Choose an existing server from the list below -> Finish.

 

The result is simple (ignore the warning):

11/07/16 18:44:32 WARN conf.Configuration: DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively

You are a bad man.

Really?

 

===end===

 

You can also run the WordCount example that ships with Hadoop: find its source code and import it, set the input and output parameters (sketched below), and run it the same way with "Run on hadoop". The detailed steps are not repeated here.
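As a sketch, the input and output are passed as program arguments in the run configuration, input path first and output path second; the paths below are assumptions, and the output directory must not exist before the run:

/user/hadoop/input /user/hadoop/output

The equivalent command line would be bin/hadoop jar hadoop-0.20.2-examples.jar wordcount /user/hadoop/input /user/hadoop/output.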

Every "Run on hadoop" generates a temporary jar package under workspace\.metadata\.plugins\org.apache.hadoop.eclipse. Only the first run needs "Run on hadoop"; after that, just click the green run button.

 

 

2.    Errors and fixes

2.1 Safe mode problem

When I deleted a folder on the DFS from Eclipse, an error appeared saying that the NameNode is in safe mode; the message is fairly explicit and includes the remedy as well.

Similarly, running a Hadoop program sometimes fails with the following error:

org.apache.hadoop.dfs.SafeModeException: Cannot delete /user/hadoop/input. Name node is in safe mode

 

To leave safe mode:

bin/hadoop dfsadmin -safemode leave 

 

You can control safe mode with dfsadmin -safemode value; the value parameter is explained below, with an example session after the list:

enter - enter safe mode
leave - force the NameNode to leave safe mode
get - report whether safe mode is on
wait - wait until safe mode ends
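For example, a check-then-leave session might look like the following; this is a sketch, and the exact output wording can vary between Hadoop versions:

$ bin/hadoop dfsadmin -safemode get
Safe mode is ON
$ bin/hadoop dfsadmin -safemode leave
Safe mode is OFF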

 

2.2 Permission denied errors during development

org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=Administrator, access=WRITE, inode="test1.txt":hadoop:supergroup:rw-r--r--
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:96)
    at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:58)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:2710)
    at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:492)
    at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:195)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:484)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:465)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:372)
    at com.kingdee.hadoop.DFSOperator.createFile(DFSOperator.java:46)
    at com.kingdee.hadoop.DFSOperator.main(DFSOperator.java:134)

 

One solution is to set the hadoop.job.ugi parameter in the "Advanced parameters" panel so that the hadoop user is added in front (e.g. hadoop,Administrator), and then do "Run on hadoop" again.

The other approach is to change the permissions of the file being operated on.

Permission denied: user=Administrator, access=WRITE, inode="test1.txt":hadoop:supergroup:rw-r--r--

The message above means: the access permissions of test1.txt are rw-r--r--, its group is supergroup, and its owner is hadoop; the Administrator user is now attempting WRITE access to test1.txt and is denied.

So you can change the permissions of test1.txt:

$ hadoop fs -chmod 777 /lory/test1.txt
$ hadoop fs -chmod 777 /lory     # or the parent directory

Of course, the -chown command works as well; the same can also be done through the FileSystem API, as sketched below.
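A minimal sketch of fixing permissions from code instead of the shell, using the FileSystem API's setPermission and setOwner; FixPermission is a hypothetical helper class, and the client identity (e.g. via hadoop.job.ugi) must already have sufficient rights, or these calls fail with the same AccessControlException:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class FixPermission {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path f = new Path("/lory/test1.txt");
        // equivalent to: hadoop fs -chmod 777 /lory/test1.txt
        fs.setPermission(f, new FsPermission((short) 0777));
        // equivalent to: hadoop fs -chown hadoop:supergroup /lory/test1.txt
        // (changing the owner normally requires a superuser)
        fs.setOwner(f, "hadoop", "supergroup");
        fs.close();
    }
}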
