Today we will walk through building Hadoop 2.2.0 in practice, on RedHat 6.2, currently a mainstream server operating system. All of the media used in this build came from the Internet; please have them ready before you start.
Role        Hostname    IP address
Namenode    Master      192.168.200.2
Datanode    Slave1      192.168.200.3
Datanode    Slave2      192.168.200.4
Datanode    Slave3      192.168.200.5
Datanode    Slave4      192.168.200.6

Software            Version
Operating system    RedHat 6.2 (64-bit)
Hadoop              Hadoop 2.2.0
JDK                 JDK 1.7 (Linux)
After planning the server roles, install the operating system on each machine and configure the network.
(Details omitted here.)
(1) After the OS installation completes, stop the firewall service and disable SELinux on all nodes.
service iptables stop
chkconfig iptables off
cat /etc/selinux/config
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
#     targeted - Targeted processes are protected,
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted
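Note that SELINUX=disabled only takes effect after a reboot. If a reboot is not convenient right away, a commonly used stopgap (not part of the original steps) is to drop SELinux into permissive mode for the current session:
# Switch SELinux to permissive mode immediately; the config file change
# above covers subsequent boots.
setenforce 0
# Confirm the current mode.
getenforce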
(2) Copy the Hadoop and JDK packages to the servers.
[root@master home]# ls
jdk-7u67-linux-x64.rpm hadoop-2.2.0.tar.gz
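If the media were downloaded to the master first, here is a hedged sketch of pushing them out to the other nodes (hostnames are not yet in /etc/hosts at this point, so raw IPs are used; adjust paths as needed):
# Copy the installation media from master to each slave (illustrative paths).
for ip in 192.168.200.3 192.168.200.4 192.168.200.5 192.168.200.6; do
scp /home/jdk-7u67-linux-x64.rpm /home/hadoop-2.2.0.tar.gz $ip:/home/
done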
(3) Modify the hostname and network settings on each server (the slave2 node is shown as an example).
cat /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=slave2
cat /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
HWADDR=f2:85:cd:9a:30:0d
NM_CONTROLLED=yes
ONBOOT=yes
IPADDR=192.168.200.4
BOOTPROTO=none
NETMASK=255.255.255.0
TYPE=Ethernet
GATEWAY=192.168.200.254
IPV6INIT=no
USERCTL=no
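Both files are only read at boot; as a hedged aside, the changes can be applied without rebooting:
# Apply the new hostname to the running system (slave2 in this example).
hostname slave2
# Restart networking so the new ifcfg-eth0 settings take effect.
service network restart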
(4) Configure the /etc/hosts file on every server.
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.200.2 master
192.168.200.3 slave1
192.168.200.4 slave2
192.168.200.5 slave3
192.168.200.6 slave4
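An optional sanity check, once every node carries this hosts file, is to confirm that each hostname resolves and answers:
# Ping each node once by name.
for h in master slave1 slave2 slave3 slave4; do
ping -c 1 $h
done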
We generally do not run Hadoop as root, so create a user for day-to-day operation and administration of Hadoop. The master and all slave nodes must have the same user and group; that is, create the hdtest user and group on every server in the cluster.
Create the user with the following commands:
useradd hdtest
passwd hdtest
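passwd prompts interactively; since the same user must be created on all five machines, a hedged alternative on RedHat is to set the password non-interactively (the password shown is only a placeholder):
# RedHat's passwd accepts the password on stdin.
echo 'hdtest' | passwd --stdin hdtest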
Copy hadoop-2.2.0.tar.gz into the hdtest user's home directory and change its owning user and group.
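A minimal sketch of that step, assuming the tarball sits in /home as shown earlier; the extraction producing /home/hdtest/hadoop-2.2.0 is implied by the paths used later:
# Hand the tarball to hdtest and unpack it in hdtest's home directory.
cp /home/hadoop-2.2.0.tar.gz /home/hdtest/
chown hdtest:hdtest /home/hdtest/hadoop-2.2.0.tar.gz
su - hdtest -c 'tar -zxf hadoop-2.2.0.tar.gz'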
This build uses JDK 1.7. Download the Linux JDK 1.7 package from the official site and copy it to every server.
Install it as root:
rpm -ivh jdk-7u67-linux-x64.rpm
Hadoop will be installed as the hdtest user this time, so hdtest's environment must be configured.
The environment variables need to be set on the master and on every slave node.
[root@master ~]# find / -name java
………………
/usr/java/jdk1.7.0_67/bin/java
……………………
[root@master home]# su - hdtest
[hdtest@master ~]$ cat .bash_profile
# .bash_profile
…………
PATH=$PATH:$HOME/bin
export PATH
export JAVA_HOME=/usr/java/jdk1.7.0_67
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:./
export HADOOP_HOME=/home/hdtest/hadoop-2.2.0
export PATH=$PATH:$HADOOP_HOME/bin/
export JAVA_LIBRARY_PATH=/home/hdtest/hadoop-2.2.0/lib/native
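After editing .bash_profile, reload it so the variables take effect in the current shell; a quick verification (not in the original) is:
# Reload the profile and confirm the environment resolves.
source ~/.bash_profile
java -version
echo $HADOOP_HOME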
Generate SSH key pairs as the hdtest user on all nodes.
[hdtest@master .ssh]$ ssh-keygen -t rsa
[hdtest@slave1 .ssh]$ ssh-keygen -t rsa
[hdtest@slave2 .ssh]$ ssh-keygen -t rsa
[hdtest@slave3 .ssh]$ ssh-keygen -t rsa
[hdtest@slave4 .ssh]$ ssh-keygen -t rsa
[hdtest@slave2 .ssh]$ ll
total 16
-rw------- 1 hdtest hdtest 1675 Sep 4 14:53 id_rsa
-rw-r--r-- 1 hdtest hdtest 395 Sep 4 14:53 id_rsa.pub
-rw-r--r-- 1 hdtest hdtest 783 Sep 4 14:58 known_hosts
Copy the public key generated on each node to one machine and merge them there (the wildcard below also picks up master's own id_rsa.pub).
[hdtest@slave1 .ssh]$ scp id_rsa.pub 192.168.200.2:/home/hdtest/.ssh/slave1.pub
[hdtest@slave2 .ssh]$ scp id_rsa.pub 192.168.200.2:/home/hdtest/.ssh/slave2.pub
[hdtest@slave3 .ssh]$ scp id_rsa.pub 192.168.200.2:/home/hdtest/.ssh/slave3.pub
[hdtest@slave4 .ssh]$ scp id_rsa.pub 192.168.200.2:/home/hdtest/.ssh/slave4.pub
[hdtest@master .ssh]$ cat *.pub >> authorized_keys
Copy the authorized_keys file assembled on master to every other machine.
scp authorized_keys slave1:/home/hdtest/.ssh/
scp authorized_keys slave2:/home/hdtest/.ssh/
scp authorized_keys slave3:/home/hdtest/.ssh/
scp authorized_keys slave4:/home/hdtest/.ssh/
Fix the file permissions on all nodes.
[hdtest@master ~]$ chmod 700 .ssh/
[hdtest@master .ssh]$ chmod 600 authorized_keys
Once the above steps are complete, test the setup.
[hdtest@master .ssh]$ ssh slave1
Last login: Thu Sep 4 15:58:39 2014 from master
[hdtest@slave1 ~]$ ssh slave3
Last login: Thu Sep 4 15:58:42 2014 from master
[hdtest@slave3 ~]$
Logging in to each server over ssh without being asked for a password confirms the configuration is complete.
Before installing Hadoop, a few directories need to be created; they match the paths referenced in the XML configuration below.
[hdtest@master ~]$ pwd
/home/hdtest
mkdir dfs/name -p
mkdir dfs/data -p
mkdir mapred/local -p
mkdir mapred/system
Every server must carry the same configuration, so configure one machine and simply copy the files to the others. On every machine, core-site.xml and mapred-site.xml point at the master's hostname, because the master is the entry point of the Hadoop cluster.
[hdtest@master hadoop]$ pwd
/home/hdtest/hadoop-2.2.0/etc/hadoop
[hdtest@master hadoop]$ cat core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>io.native.lib.available</name>
<value>true</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
<final>true</final>
</property>
</configuration>
[hdtest@master hadoop]$ cat hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hdtest/dfs/name</value>
<description>Determines where on the local filesystem the DFS name node should store the name table. If this is a comma-delimited list of directories, then the name table is replicated in all of the directories, for redundancy.</description>
<final>true</final>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/hdtest/dfs/data</value>
<description>Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices.</description>
<final>true</final>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
<description>Number of block replicas.</description>
</property>
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
</configuration>
[hdtest@master hadoop]$ cat mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>hdfs://master:9001</value>
<final>true</final>
</property>
<property>
<name>mapreduce.map.memory.mb</name>
<value>1536</value>
</property>
<property>
<name>mapreduce.map.java.opts</name>
<value>-Xmx1024M</value>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>3072</value>
</property>
<property>
<name>mapreduce.reduce.java.opts</name>
<value>-Xmx2560M</value>
</property>
<property>
<name>mapreduce.task.io.sort.mb</name>
<value>512</value>
</property>
<property>
<name>mapreduce.task.io.sort.factor</name>
<value>100</value>
</property>
<property>
<name>mapreduce.reduce.shuffle.parallelcopies</name>
<value>50</value>
</property>
<property>
<name>mapred.system.dir</name>
<value>file:/home/hdtest/mapred/system</value>
<final>true</final>
</property>
<property>
<name>mapred.local.dir</name>
<value>file:/home/hdtest/mapred/local</value>
</property>
</configuration>
[hdtest@master hadoop]$ cat yarn-site.xml
<?xml version="1.0"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<configuration>
<property>
<name>yarn.resourcemanager.address</name>
<value>master:8080</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:8081</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:8082</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<!-- Site specific YARN configuration properties -->
</configuration>
Edit the hadoop-env.sh, yarn-env.sh, and mapred-env.sh files and update the following path in each:
export JAVA_HOME=/usr/java/jdk1.7.0_67
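A hedged one-pass way to apply the same setting to all three files (appending works because each env script is sourced in full before JAVA_HOME is checked):
cd /home/hdtest/hadoop-2.2.0/etc/hadoop
# Append the JDK path to each env script.
for f in hadoop-env.sh yarn-env.sh mapred-env.sh; do
echo 'export JAVA_HOME=/usr/java/jdk1.7.0_67' >> $f
done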
The masters and slaves files only need to be configured on the namenode machine. In this cluster the master runs only the namenode; in general the namenode should be a dedicated machine and should not double as a datanode.
[hdtest@master hadoop]$ pwd
/home/hdtest/hadoop-2.2.0/etc/hadoop
[hdtest@master hadoop]$ cat masters
192.168.200.2
[hdtest@master hadoop]$ cat slaves
192.168.200.3
192.168.200.4
192.168.200.5
192.168.200.6
Once the above configuration is done, distribute the hadoop directory to every slave node.
[hdtest@master ~]$ scp -r hadoop-2.2.0 slave1:/home/hdtest/
[hdtest@master ~]$ scp -r hadoop-2.2.0 slave2:/home/hdtest/
[hdtest@master ~]$ scp -r hadoop-2.2.0 slave3:/home/hdtest/
[hdtest@master ~]$ scp -r hadoop-2.2.0 slave4:/home/hdtest/
Run the following command on the master node to format the namenode.
[hdtest@master bin]$ pwd
/home/hdtest/hadoop-2.2.0/bin
[hdtest@master bin]$ ./hadoop namenode -format
A message such as "successfully formatted" in the output indicates the format succeeded.
When re-formatting, the system prompts:
Re-format filesystem in /home/hadoop/tmp/dfs/name ? (Y or N) You must enter an uppercase Y; a lowercase y is not rejected as invalid input, but the format will fail.
Start the Hadoop services with the following command; it only needs to be run on the namenode.
[hdtest@master sbin]$ pwd
/home/hdtest/hadoop-2.2.0/sbin
[hdtest@master sbin]$ ./start-all.sh (use stop-all.sh to stop the services)
If startup succeeded, the jps command should show NameNode, SecondaryNameNode, and ResourceManager on the master, and DataNode and NodeManager on each slave.
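Since passwordless ssh is already in place, an optional sketch for checking every node from the master in one loop (jps lives in the JDK's bin directory):
# Run jps on each node and label the output.
for h in master slave1 slave2 slave3 slave4; do
echo "== $h =="
ssh $h /usr/java/jdk1.7.0_67/bin/jps
done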
[hdtest@master ~]$ netstat -ntpl
[hdtest@master sbin]$ hadoop dfsadmin -report
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /home/hdtest/hadoop-2.2.0/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
14/09/05 10:48:23 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Configured Capacity: 167811284992 (156.29 GB)
Present Capacity: 137947226112 (128.47 GB)
DFS Remaining: 137947127808 (128.47 GB)
DFS Used: 98304 (96 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Datanodes available: 4 (4 total, 0 dead)
Live datanodes:
Name: 192.168.200.5:50010 (slave3)
Hostname: slave3
Decommission Status : Normal
Configured Capacity: 41952821248 (39.07 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 7465213952 (6.95 GB)
DFS Remaining: 34487582720 (32.12 GB)
DFS Used%: 0.00%
DFS Remaining%: 82.21%
Last contact: Fri Sep 05 10:48:23 CST 2014
Name: 192.168.200.3:50010 (slave1)
Hostname: slave1
Decommission Status : Normal
Configured Capacity: 41952821248 (39.07 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 7465467904 (6.95 GB)
DFS Remaining: 34487328768 (32.12 GB)
DFS Used%: 0.00%
DFS Remaining%: 82.21%
Last contact: Fri Sep 05 10:48:24 CST 2014
Name: 192.168.200.6:50010 (slave4)
Hostname: slave4
Decommission Status : Normal
Configured Capacity: 41952821248 (39.07 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 7467925504 (6.96 GB)
DFS Remaining: 34484871168 (32.12 GB)
DFS Used%: 0.00%
DFS Remaining%: 82.20%
Last contact: Fri Sep 05 10:48:24 CST 2014
Name: 192.168.200.4:50010 (slave2)
Hostname: slave2
Decommission Status : Normal
Configured Capacity: 41952821248 (39.07 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 7465451520 (6.95 GB)
DFS Remaining: 34487345152 (32.12 GB)
DFS Used%: 0.00%
DFS Remaining%: 82.21%
Last contact: Fri Sep 05 10:48:22 CST 2014
Finally, verify in a browser: http://192.168.200.2:50070/ serves the NameNode web UI, and http://192.168.200.2:8088/cluster serves the YARN ResourceManager cluster view.
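As a final smoke test (not part of the original walkthrough), the example jar bundled with the 2.2.0 distribution can confirm that HDFS and YARN accept work end to end:
cd /home/hdtest/hadoop-2.2.0
# Run the bundled pi estimator: 2 map tasks, 10 samples per map.
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar pi 2 10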