Hadoop Deployment
1. Prepare the environment
Hardware environment:
(1) Distributed computing needs multiple machines, so you must provide several machines for deployment; how many depends on the deployment plan. The standard deployment plan in this text calls for 6 machines.
In practice, fully distributed Hadoop needs at least two machines (one master node and one slave node); using more machines (one master node, several slave nodes) shows the fully distributed mode more clearly. (Hardware: each machine needs at least 1 GB of RAM and 20 GB of disk space.)
(2) Environment setup
1. Download and install VMware Workstation.
2. Download the CentOS installation ISO.
3. Create a new CentOS virtual machine: open VMware Workstation → File → New Virtual Machine Wizard
→ Typical → Installer disc image file (iso) → enter a user name and password (the user name joe is recommended) → enter the machine name cMaster → continue to Finish.
4. Repeat step 3 twice more, so that the three CentOS hosts are named cMaster, cSlave0, and cSlave1.
Software environment:
(1) We deploy Hadoop on CentOS, a mature Linux distribution. Note that Hadoop cannot be deployed directly on a freshly installed CentOS machine; some basic configuration is needed first (change the machine name, add domain-name mappings, disable the firewall, and install the JDK).
a. Change the machine name
[joe@localhost~]$ su - root                     # switch to root to change the machine name
[root@localhost~]# vim /etc/sysconfig/network   # edit the file that stores the machine name
Change HOSTNAME=localhost.localdomain to the required machine name (cMaster / cSlave0 / cSlave1):
HOSTNAME=cMaster
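A minimal sketch of the same edit done non-interactively with sed instead of vim. To keep it safe to try, it operates on a stand-in file network.demo; on the real node run the sed line against /etc/sysconfig/network as root.

```shell
# network.demo stands in for /etc/sysconfig/network (assumption for safe testing)
printf 'NETWORKING=yes\nHOSTNAME=localhost.localdomain\n' > network.demo
# rewrite the HOSTNAME line in place
sed -i 's/^HOSTNAME=.*/HOSTNAME=cMaster/' network.demo
grep '^HOSTNAME=' network.demo    # prints HOSTNAME=cMaster
```

The file only takes effect at boot; running `hostname cMaster` as root applies the new name to the running system immediately.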
b. Add domain-name mappings
First check each machine's IP address:
[root@cMaster~]# ifconfig
In this setup cMaster's IP address is 192.168.64.139, so the three mappings are:
192.168.64.139 cMaster
192.168.64.140 cSlave0
192.168.64.141 cSlave1
[root@cMaster~]# vim /etc/hosts
Add the mappings above to the file /etc/hosts:
192.168.64.139 cMaster
192.168.64.140 cSlave0
192.168.64.141 cSlave1
With the mappings in place, you can ping the other two machines from cMaster by name, e.g.:
[root@cMaster~]# ping cSlave0    # ping machine cSlave0 from machine cMaster
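The /etc/hosts edit above can be scripted with a single heredoc append. hosts.demo below stands in for /etc/hosts so the sketch is safe to try; on each real machine run the same heredoc against /etc/hosts as root.

```shell
# append all three mappings at once (hosts.demo is a stand-in for /etc/hosts)
cat >> hosts.demo <<'EOF'
192.168.64.139 cMaster
192.168.64.140 cSlave0
192.168.64.141 cSlave1
EOF
grep cSlave0 hosts.demo    # prints 192.168.64.140 cSlave0
```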
c. Disable the firewall
By default the CentOS firewall (iptables) blocks communication between the machines. The following command disables it permanently, but only takes effect after a reboot:
[root@cMaster~]# chkconfig --level 35 iptables off   # permanently disable the firewall; effective after reboot
[root@cMaster~]# service iptables stop               # also stop it in the running system, without rebooting
d. Install the JDK
The version used here is jdk1.8.0_101, unpacked to /home/joe/jdk1.8.0_101.
Configure the environment variables:
[root@cMaster~]# vim /etc/profile
Add the following lines (near the top of the file):
JAVA_HOME=/home/joe/jdk1.8.0_101
PATH=$JAVA_HOME/bin:$PATH
CLASSPATH=$JAVA_HOME/jre/lib/ext:$JAVA_HOME/lib/tools.jar
export PATH JAVA_HOME CLASSPATH
Verify the installation (reload the profile first with source /etc/profile):
[root@cMaster~]# java -version
If the JDK is installed correctly, this prints the Java version information.
Make a deployment plan:
cMaster is the master node; cSlave0 and cSlave1 are the slave nodes.
(5) Unpack Hadoop.
Log in to all three machines as joe and run the following command on each to unpack the Hadoop archive:
[joe@cMaster~]$ tar -zxvf /home/joe/hadoop-2.7.3.tar.gz
[joe@cSlave0~]$ tar -zxvf /home/joe/hadoop-2.7.3.tar.gz
[joe@cSlave1~]$ tar -zxvf /home/joe/hadoop-2.7.3.tar.gz
(6) Configure Hadoop.
First edit hadoop-env.sh to point Hadoop at the JDK:
vim /home/joe/hadoop-2.7.3/etc/hadoop/hadoop-env.sh
Find:
export JAVA_HOME=${JAVA_HOME}
and change it to:
export JAVA_HOME=/home/joe/jdk1.8.0_101
1. Edit the file core-site.xml: vim /home/joe/hadoop-2.7.3/etc/hadoop/core-site.xml and place the following properties inside its configuration tag. As with the previous step, do this on all three machines:
<configuration>
<property><name>hadoop.tmp.dir</name><value>/home/joe/cloudData</value></property>
<property><name>fs.defaultFS</name><value>hdfs://cMaster:8020</value></property>
</configuration>
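Since all three machines need an identical core-site.xml, one option is to write it once with a heredoc and copy it out with scp (the scp loop assumes passwordless ssh is set up for joe; otherwise it prompts per host):

```shell
# write the config once, locally
cat > core-site.xml <<'EOF'
<configuration>
<property><name>hadoop.tmp.dir</name><value>/home/joe/cloudData</value></property>
<property><name>fs.defaultFS</name><value>hdfs://cMaster:8020</value></property>
</configuration>
EOF
grep -c '<name>' core-site.xml    # prints 2
# then push it to the slaves (uncomment on the real cluster):
# for h in cSlave0 cSlave1; do scp core-site.xml joe@$h:hadoop-2.7.3/etc/hadoop/; done
```

The same pattern works for yarn-site.xml and mapred-site.xml below. hadoop.tmp.dir sets where HDFS keeps its data, and fs.defaultFS makes hdfs://cMaster:8020 the default filesystem, which is why all later hdfs commands talk to cMaster:8020.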
2. Edit the file yarn-site.xml: vim /home/joe/hadoop-2.7.3/etc/hadoop/yarn-site.xml and place the following properties inside its configuration tag, again on all three machines:
<configuration>
<property><name>yarn.resourcemanager.hostname</name><value>cMaster</value></property>
<property><name>yarn.nodemanager.aux-services</name><value>mapreduce_shuffle</value></property>
</configuration>
3. Rename the file /home/joe/hadoop-2.7.3/etc/hadoop/mapred-site.xml.template to /home/joe/hadoop-2.7.3/etc/hadoop/mapred-site.xml,
then edit it: vim /home/joe/hadoop-2.7.3/etc/hadoop/mapred-site.xml and place the following property inside its configuration tag, again on all three machines:
<configuration>
<property><name>mapreduce.framework.name</name><value>yarn</value></property>
</configuration>
(7) Start Hadoop.
First, format the NameNode namespace on the master node cMaster, then start the NameNode and ResourceManager there:
[joe@cMaster ~]$ hadoop-2.7.3/bin/hdfs namenode -format
[joe@cMaster ~]$ hadoop-2.7.3/sbin/hadoop-daemon.sh start namenode
[joe@cMaster ~]$ hadoop-2.7.3/sbin/yarn-daemon.sh start resourcemanager
Start the daemons on the slave nodes:
[joe@cSlave0 ~]$ hadoop-2.7.3/sbin/hadoop-daemon.sh start datanode
[joe@cSlave0 ~]$ hadoop-2.7.3/sbin/yarn-daemon.sh start nodemanager
[joe@cSlave1 ~]$ hadoop-2.7.3/sbin/hadoop-daemon.sh start datanode
[joe@cSlave1 ~]$ hadoop-2.7.3/sbin/yarn-daemon.sh start nodemanager
Verify with jps that the DataNode and NodeManager are running:
[joe@cSlave1 ~]$ /home/joe/jdk1.8.0_101/bin/jps
2916 NodeManager
2848 DataNode
3001 Jps
Example 5.6: run the wordcount example.
[joe@cMaster hadoop-2.7.3]$ bin/hdfs dfs -mkdir /in
mkdir: Call From cMaster/192.168.64.139 to cMaster:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
[joe@cMaster hadoop-2.7.3]$ bin/hdfs dfs -put /home/joe/hadoop-2.7.3/etc/hadoop/* /in
put: Call From cMaster/192.168.64.139 to cMaster:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
[joe@cMaster hadoop-2.7.3]$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /in /out/wc-01
16/10/02 14:32:54 INFO client.RMProxy: Connecting to ResourceManager at cMaster/192.168.64.139:8032
16/10/02 14:32:54 ERROR security.UserGroupInformation: PriviledgedActionException as:joe (auth:SIMPLE) cause:java.net.ConnectException: Call From cMaster/192.168.64.139 to cMaster:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
java.net.ConnectException: Call From cMaster/192.168.64.139 to cMaster:8020 failed on connection exception: java.net.ConnectException: Connection refused
        at org.apache.hadoop.ipc.Client.call(Client.java:1351)
        at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1679)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
        at org.apache.hadoop.examples.WordCount.main(WordCount.java:84)
        ... (remaining stack frames omitted)
Caused by: java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        ... 40 more
All three commands fail the same way: the client cannot reach the NameNode at cMaster:8020, which means the NameNode process on cMaster is not running (or exited after startup). Check it with jps on cMaster and look at the NameNode log under hadoop-2.7.3/logs, then restart it before rerunning the example.
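A sketch of the usual first checks for this Connection refused error, run on cMaster (the jps tool is the one shipped with the JDK, as used above; the `|| echo` fallbacks make the output readable either way):

```shell
# is the NameNode JVM alive?
jps 2>/dev/null | grep NameNode || echo "NameNode not running"
# is anything listening on the NameNode RPC port from fs.defaultFS?
netstat -tln 2>/dev/null | grep 8020 || echo "nothing listening on 8020"
```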