Getting Started with Hadoop, Part 10: Building a Distributed Hadoop Cluster

1. Create a template system:

        Refer to the previous articles in this series. This example uses Ubuntu 10.10, with user hadoop, password dg, and hostname hadoop-dg.

        1) Extract the JDK and Hadoop, and add their bin directories to the PATH environment variable:

jdk1.7.0_17   
hadoop-1.1.2
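
        For example, both bin directories can be added to PATH in ~/.bashrc (a minimal sketch; the install paths under /home/hadoop are assumptions based on the versions above):

#~/.bashrc -- assumed install locations under the hadoop user's home directory
export JAVA_HOME=/home/hadoop/jdk1.7.0_17
export HADOOP_HOME=/home/hadoop/hadoop-1.1.2
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin

#reload the shell configuration
source ~/.bashrc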

        2) Configure %hadoop%/conf/hadoop-env.sh:

export JAVA_HOME=/path/to/your/jdk

        3) Install OpenSSH (password-less login is not configured yet):

openssh-client  
openssh-server
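
        On Ubuntu these can be installed with apt-get (a minimal sketch; openssh-client and openssh-server are the standard package names):

#install the OpenSSH client and server
sudo apt-get update
sudo apt-get install openssh-client openssh-server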

        4) Configure a static IP for the network interface, the hostname, and the hosts file:

#check the network interface name
ifconfig

#configure the interface parameters
sudo gedit /etc/network/interfaces
auto eth0   
iface eth0 inet static   
address 192.168.1.251   
gateway 192.168.1.1   
netmask 255.255.255.0
#restart networking
sudo /etc/init.d/networking restart
#edit the hostname
sudo gedit /etc/hostname
hadoop-dg
#edit the hosts file
sudo gedit /etc/hosts

127.0.0.1 hadoop-dg
192.168.1.251 hadoop-dg

        5) Give the user write permission on the Hadoop installation directory:

sudo chown -hR hadoop hadoop-1.1.2

        6) Disable the firewall:

sudo ufw disable

#check the firewall status
sudo ufw status
#re-enable if needed
#sudo ufw enable
   

2. Clone the dg1, dg2, and dg3 virtual machines from the template system:

        

#at the Windows command prompt, change to the VirtualBox install directory and run this command; the last argument is the path to the cloned vdi file
%virtualbox%> VBoxManage internalcommands sethduuid %dg2.vdi%

        1) Use dg1 as the master (where the namenode and jobtracker run):

                (1) Configure a static IP:

    

#configure the interface parameters
sudo gedit /etc/network/interfaces

auto eth2   
iface eth2 inet static   
address 192.168.1.251   
gateway 192.168.1.1   
netmask 255.255.255.0
   
#restart networking
sudo /etc/init.d/networking restart

                (2) Configure the hostname:

sudo gedit /etc/hostname
master

                (3) Configure the hosts file:

sudo gedit /etc/hosts
127.0.0.1 master
192.168.1.251 master   
192.168.1.252 slave1   
192.168.1.253 slave2

   

                Restart the system after configuring the hostname and hosts file.

                (4) Configure Hadoop:

                ① %hadoop%/conf/core-site.xml:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->
<configuration>
	<property>
		<name>fs.default.name</name>
		<value>hdfs://master:9000</value>
	</property>
  	<property>
  		<name>hadoop.tmp.dir</name>
               <!-- the current user must have read/write permission on this directory -->
  		<value>/home/hadoop/hadoop-${user.name}</value> 
  	</property> 
</configuration>

  

                ② %hadoop%/conf/hdfs-site.xml:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
  
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
  		<name>dfs.replication</name>
  		<!-- there are currently two slave nodes -->
  		<value>2</value>
  	</property>      
</configuration>

  

                 ③ %hadoop%/conf/mapred-site.xml:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
  
<!-- Put site-specific property overrides in this file. -->
<configuration>
  	<property>
  		<name>mapred.job.tracker</name>
  		<value>master:9001</value>
  	</property>
</configuration>

  

                ④ %hadoop%/conf/masters:   

master

    

                ⑤ %hadoop%/conf/slaves:    

slave1
slave2

      

                ⑥ After the slave machines have been created, run:

#copy all configuration files to slave1
scp -rpv ~/hadoop-1.1.2/conf/* slave1:~/hadoop-1.1.2/conf/

#copy all configuration files to slave2
scp -rpv ~/hadoop-1.1.2/conf/* slave2:~/hadoop-1.1.2/conf/

                First: change the network configuration of the two slave machines so that they can ping each other and the master:
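
                A quick connectivity check might look like this (a sketch, run from slave1; the IP addresses are the ones assigned above):

#check that the master and the other slave are reachable
ping -c 3 192.168.1.251
ping -c 3 192.168.1.253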

                Then: run the scp commands above on the master:

  

                After that, the copied files can be seen on the slave1/slave2 machines.
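
                For example, the copied files can be listed from the master (a sketch; before password-less SSH is set up in step (5), ssh will prompt for the hadoop user's password):

#check from the master that the configuration files arrived on the slaves
ssh slave1 ls ~/hadoop-1.1.2/conf
ssh slave2 ls ~/hadoop-1.1.2/conf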


                (5) Create a passphrase-less public key:

ssh-keygen

  

#append the public key from id_rsa.pub to the authorized_keys file (run inside ~/.ssh)
cat id_rsa.pub >> authorized_keys

  

  

                First: create the ~/.ssh directory on each slave:

#go to the user's home directory
cd ~/   

#create the .ssh directory
sudo mkdir .ssh  

#take ownership of the .ssh directory
sudo chown -hR hadoop .ssh

                Then: copy the public key from the master to the slaves:

#run after the slave machines have been created   

#copy the master's passphrase-less public key to slave1
scp authorized_keys slave1:~/.ssh/    

#copy the master's passphrase-less public key to slave2
scp authorized_keys slave2:~/.ssh/
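
                If SSH still prompts for a password after the key has been copied, directory permissions are a common culprit; a hedged fix, run as the hadoop user on each slave:

#sshd rejects keys whose files are group- or world-writable
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys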

 
                Log in to the slaves from the master to verify SSH:

#log in to slave1
ssh slave1  

exit

#log in to slave2
ssh slave2

 

        2) Use dg2 and dg3 as the slaves (where the datanodes and tasktrackers run):

                (1) Configure a static IP:

#check the interface configuration
ifconfig


#configure the interface parameters
sudo gedit /etc/network/interfaces

#on slave1 (dg2)
auto eth1   
iface eth1 inet static   
address 192.168.1.252   
gateway 192.168.1.1   
netmask 255.255.255.0

#on slave2 (dg3)
auto eth1   
iface eth1 inet static   
address 192.168.1.253   
gateway 192.168.1.1   
netmask 255.255.255.0

#restart networking
sudo /etc/init.d/networking restart

                (2) Configure the hostname (slave1 on dg2, slave2 on dg3):

sudo gedit /etc/hostname
slave1
slave2

                (3) Configure the hosts file:

sudo gedit /etc/hosts

#on slave1
127.0.0.1 slave1
192.168.1.251 master   
192.168.1.252 slave1   
192.168.1.253 slave2

#on slave2
127.0.0.1 slave2
192.168.1.251 master   
192.168.1.252 slave1   
192.168.1.253 slave2


                Restart the system after configuring the hostname and hosts file.

                (4) Configure Hadoop:

                Copy the Hadoop configuration files from the master:

scp -rpv ~/hadoop-1.1.2/conf/* slave1:~/hadoop-1.1.2/conf/    

scp -rpv ~/hadoop-1.1.2/conf/* slave2:~/hadoop-1.1.2/conf/

                (5) Copy the master's passphrase-less public key:

                First: create the ~/.ssh directory on the slaves:

cd ~/   

sudo mkdir .ssh  

sudo chown -hR hadoop .ssh

                Then: copy the public key from the master to the slaves:

scp authorized_keys slave1:~/.ssh/    

scp authorized_keys slave2:~/.ssh/

                Log in to the slaves from the master to verify:

ssh slave1  

ssh slave2


3. Start the cluster:

        1) Format HDFS on the master:

hadoop namenode -format

        2) Start Hadoop on the master:

start-all.sh

 

        3) Verification 1 - check the hadoop.tmp.dir directory:

After Hadoop starts successfully, a dfs folder is created in the directory specified by hadoop.tmp.dir on the master, and both dfs and mapred folders are created in the hadoop.tmp.dir directory on each slave.
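
A quick way to check (a sketch, assuming the hadoop.tmp.dir value /home/hadoop/hadoop-hadoop that follows from core-site.xml above):

#on the master: should contain a dfs folder
ls /home/hadoop/hadoop-hadoop

#on each slave: should contain dfs and mapred folders
ssh slave1 ls /home/hadoop/hadoop-hadoop
ssh slave2 ls /home/hadoop/hadoop-hadoop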

        4) Verification 2 - check the DFS report:

cd ~/hadoop-1.1.2/bin/   

hadoop dfsadmin -report

                Error:
                      0 datanodes are available:
   
                Checking jps, everything looks normal:
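
                For reference, on a healthy Hadoop 1.x cluster with this layout, jps typically shows something like the following (a sketch; the process IDs are placeholders):

#on master
jps
#2101 NameNode
#2318 SecondaryNameNode
#2405 JobTracker
#2590 Jps

#on slave1 / slave2
jps
#1873 DataNode
#1967 TaskTracker
#2054 Jps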
  

                Check the log %hadoop%/logs/hadoop-hadoop-namenode-master.log:

2013-05-22 17:08:27,164 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hadoop cause:java.io.IOException: File /home/hadoop/hadoop-hadoop/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
2013-05-22 17:08:27,164 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 9000, call addBlock(/home/hadoop/hadoop-hadoop/mapred/system/jobtracker.info, DFSClient_NONMAPREDUCE_592417942_1, null) from 127.0.0.1:47247: error: java.io.IOException: File /home/hadoop/hadoop-hadoop/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
java.io.IOException: File /home/hadoop/hadoop-hadoop/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1639)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:736)
	at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:601)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387)

                Solution 1:
                Configure the hosts file on the master and both slaves as:
127.0.0.1	localhost.localdomain	localhost
::1		localhost6.localdomain6	localhost6
  
192.168.1.251	master
192.168.1.252	slave1
192.168.1.253	slave2

                Reboot the machines and retry. This did not help.

                
                Solution 2 (the individual steps are listed below; a condensed command sketch follows step 9):

                       1. On slave1, run ssh-keygen and append the generated id_rsa.pub to the existing authorized_keys (originally created on the master). The key file now holds the RSA public keys of both machines;

#generate slave1's own id_rsa.pub key file
ssh-keygen

#use >> to append it to the existing file
cat id_rsa.pub >> authorized_keys

                      
                       2. On slave2, delete the existing authorized_keys file in the ~/.ssh directory;

                       3. On slave1, copy the authorized_keys file (the passphrase-less public keys) to slave2:
scp authorized_keys slave2:~/.ssh/

                      
                       4. On slave2, run ssh-keygen and append the generated id_rsa.pub to authorized_keys. The key file now holds the RSA public keys of all three machines;

                       5. On slave2, copy authorized_keys to master and slave1, replacing the original files;

                       6. From master, ssh to slave1, slave2, and master;

                       7. From slave1, ssh to slave1, slave2, and master;

                       8. From slave2, ssh to slave1, slave2, and master;

                       9. hadoop dfsadmin -report          
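
                A condensed sketch of steps 1-8, assuming the hadoop user and the ~/.ssh paths used above (in steps 6-8 each host key still has to be accepted once):

#on slave1: generate a key pair and merge slave1's public key into the authorized_keys copied from master
ssh-keygen
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

#on slave2: remove the old copy
rm ~/.ssh/authorized_keys

#on slave1: push the merged file to slave2
scp ~/.ssh/authorized_keys slave2:~/.ssh/

#on slave2: add slave2's own public key
ssh-keygen
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

#on slave2: push the combined file back to master and slave1
scp ~/.ssh/authorized_keys master:~/.ssh/
scp ~/.ssh/authorized_keys slave1:~/.ssh/

#finally, from each of master, slave1 and slave2, ssh once to all three hosts
ssh master
ssh slave1
ssh slave2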

                Success - the report now shows both datanodes:
hadoop@master:~/hadoop-1.1.2/bin$ hadoop dfsadmin -report
Configured Capacity: 16082411520 (14.98 GB)
Present Capacity: 9527099422 (8.87 GB)
DFS Remaining: 9527042048 (8.87 GB)
DFS Used: 57374 (56.03 KB)
DFS Used%: 0%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 2 (2 total, 0 dead)

Name: 192.168.1.252:50010
Decommission Status : Normal
Configured Capacity: 8041205760 (7.49 GB)
DFS Used: 28687 (28.01 KB)
Non DFS Used: 3277914097 (3.05 GB)
DFS Remaining: 4763262976(4.44 GB)
DFS Used%: 0%
DFS Remaining%: 59.24%
Last contact: Wed May 22 22:54:29 CST 2013


Name: 192.168.1.253:50010
Decommission Status : Normal
Configured Capacity: 8041205760 (7.49 GB)
DFS Used: 28687 (28.01 KB)
Non DFS Used: 3277398001 (3.05 GB)
DFS Remaining: 4763779072(4.44 GB)
DFS Used%: 0%
DFS Remaining%: 59.24%
Last contact: Wed May 22 22:54:30 CST 2013

        5) Verification 3 - web monitoring:

               http://192.168.1.251:50030  (JobTracker web UI)

               http://192.168.1.251:50070  (NameNode/HDFS web UI)
   
- end 
