Installing Hadoop + ZooKeeper HA

Preliminary work: configure the network, set the hostnames, and disable the firewall
    chkconfig iptables off    //disable the firewall at boot

1. Install Java and configure the related variables (/etc/profile)
    #java
    export JAVA_HOME=/usr/java/jdk1.8.0_65
    export JRE_HOME=$JAVA_HOME/jre
    export PATH=$PATH:$JAVA_HOME/bin
    export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
    (the leading "." is required)
    Save and exit, then run: source /etc/profile

2. Set the hostnames and network mapping (/etc/hosts)
    // hadoop.master is the namenode
    // hadoop.slaver1 / hadoop.slaver2 / hadoop.slaver3 are datanodes
    192.168.22.241 hadoop.master
    192.168.22.242 hadoop.slaver1
    192.168.22.243 hadoop.slaver2
    192.168.22.244 hadoop.slaver3

3. Create the user and set its password (log in as root)
    1. useradd hadoop (or another user name)
    2. passwd hadoop (type the password twice)
    3. su hadoop (switch to the hadoop user)

4. Passwordless login
    1. Install ssh; it is usually preinstalled (look up the details online if not).
    2. Create the .ssh directory under the home directory (as the hadoop user):
        mkdir ~/.ssh
    3. Generate the key pair (run on the namenode):
        ssh-keygen -t rsa
        Press Enter at every prompt. This leaves id_rsa and id_rsa.pub under ~/.ssh;
        the former is the private key, the latter the public key.
    4. Copy id_rsa.pub to the same user's .ssh directory on every slaver node:
        scp -r id_rsa.pub username@hostname:target-file(including path)
    5. On each child node run: cat id_rsa.pub >> ~/.ssh/authorized_keys
    6. Set the permissions:
        chmod 600 authorized_keys
        cd ..
        chmod 700 -R .ssh
    7. Note that login is still not password-free at this point: run ssh slaver from the
       master node and enter the password once; after that, logins are password-free.
    Steps 3-7 can be scripted; see the sketch after this section.
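A minimal sketch of steps 3-7, assuming the hadoop user already exists on every node and the hostnames match /etc/hosts above. ssh-copy-id creates ~/.ssh and authorized_keys with the correct permissions on the remote side, so the manual chmod steps are not needed:

    #!/bin/bash
    # Run on hadoop.master as the hadoop user, after ssh-keygen -t rsa.
    set -e
    for host in hadoop.slaver1 hadoop.slaver2 hadoop.slaver3; do
        # appends ~/.ssh/id_rsa.pub to the remote authorized_keys (asks for the password once)
        ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@"$host"
        # verify that the login is now password-free
        ssh hadoop@"$host" hostname
    done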
5. Install ZooKeeper (three machines: master, slaver1, slaver2)
    1. Download the package.
    2. Extract it:
        tar zxvf zookeeper-3.4.7.tar.gz
    3. Configure the environment variables (/etc/profile):
        #zookeeper
        export ZOOKEEPER_HOME=/opt/zookeeper-3.4.7
        export PATH=$PATH:$ZOOKEEPER_HOME/bin:$ZOOKEEPER_HOME/conf
        Save and exit, then run: source /etc/profile
    4. Edit the configuration file:
        cp zoo_sample.cfg zoo.cfg
        vim zoo.cfg
        ####zoo.cfg####
        tickTime=2000
        initLimit=10
        syncLimit=5
        dataDir=/opt/zookeeper-3.4.7/tmp/zookeeper    (create this directory beforehand)
        clientPort=2181
        server.1=hadoop.master:2888:3888
        server.2=hadoop.slaver1:2888:3888
        server.3=hadoop.slaver2:2888:3888
        Parameter notes:
        tickTime: the basic time unit used by zookeeper, in milliseconds.
        dataDir: the data directory; may be any directory.
        dataLogDir: the log directory; likewise any directory. If unset, it uses the same value as dataDir.
        clientPort: the port that listens for client connections.
        initLimit: the ensemble consists of several servers, one leader and the rest followers; this is the maximum number of ticks a follower may take to connect to the leader and finish syncing at startup.
        syncLimit: the maximum number of ticks allowed between a request and an answer when the leader and a follower exchange messages.
        server.X=A:B:C, where X is a number giving the server's ordinal, A is the server's IP address or hostname, B is the port used to exchange messages with the leader, and C is the port used for leader election.
    5. Distribute to every node:
        scp -r /opt/zookeeper-3.4.7 hadoop@hostname:/opt
    6. Under the directory configured as dataDir, create a myid file containing one number, the server's ordinal:
        cd /opt/zookeeper-3.4.7/tmp/zookeeper
        touch myid    (with the configuration above: master is 1, slaver1 is 2, slaver2 is 3)
    7. Common commands (a start-and-verify sketch follows this section):
        ####start/stop/status zk####
        zkServer.sh start    //run once on every host in the ensemble
        zkServer.sh stop
        zkServer.sh status
        ####view/delete znodes####
        zkCli.sh
        ls /
        rmr /znode-name
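A minimal start-and-verify sketch, assuming passwordless ssh from the master and that zkServer.sh is on the PATH of non-interactive shells on every node (if /etc/profile is not read there, use the full /opt/zookeeper-3.4.7/bin/zkServer.sh path instead):

    #!/bin/bash
    # Start the ensemble, then confirm that exactly one node reports leader.
    for host in hadoop.master hadoop.slaver1 hadoop.slaver2; do
        ssh hadoop@"$host" zkServer.sh start
    done
    sleep 5
    for host in hadoop.master hadoop.slaver1 hadoop.slaver2; do
        echo "== $host =="
        ssh hadoop@"$host" zkServer.sh status    # prints "Mode: leader" or "Mode: follower"
    done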
6. Install hadoop (four machines: master, slaver1, slaver2, slaver3; the namenodes are master and slaver1)
    1. Download the package.
    2. Extract it.
    3. Configure the environment variables (/etc/profile):
        #hadoop
        export HADOOP_HOME=/opt/hadoop-2.5.2
        export HADOOP_PREFIX=/opt/hadoop-2.5.2
        export HADOOP_COMMON_HOME=$HADOOP_HOME
        export HADOOP_MAPRED_HOME=$HADOOP_HOME
        export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
        export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
        export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
        export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native
        export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HADOOP_HOME/lib
        export CLASSPATH=.:$CLASSPATH:$HADOOP_HOME/bin
        Save and exit, then run: source /etc/profile
    4. Edit the configuration files
        1. Create the related directories:
            cd /opt/hadoop-2.5.2
            mkdir logs
            mkdir tmp
        2. Adjust the relevant parameters in core-site.xml / hadoop-env.sh / hdfs-site.xml / log4j.properties / mapred-env.sh / mapred-site.xml / masters / slaves / yarn-env.sh / yarn-site.xml (a sanity check of the finished files follows this section):

####core-site.xml####
<configuration>
    <!-- the hdfs nameservice; with HA this is the logical name, with no port -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://ns1</value>
    </property>
    <!-- the read/write buffer size -->
    <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
    </property>
    <!-- the hadoop temporary directory -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/hadoop-2.5.2/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
    <!-- the zookeeper quorum -->
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>hadoop.master:2181,hadoop.slaver1:2181,hadoop.slaver2:2181</value>
    </property>
</configuration>

####hadoop-env.sh####
export JAVA_HOME=/usr/java/jdk1.8.0_65
export HADOOP_CLASSPATH=.:$HADOOP_CLASSPATH:$HADOOP_HOME/bin
export CLASSPATH=.:$CLASSPATH:$HADOOP_HOME/bin

####hdfs-site.xml####
<configuration>
    <property>
        <name>dfs.namenode.http-address</name>
        <value>hadoop.master:50070</value>
        <description>The address and the base port where the dfs namenode web ui will listen on.</description>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hadoop.slaver1:50070</value>
    </property>
    <property>
        <name>dfs.namenode.checkpoint.dir</name>
        <value>file://${hadoop.tmp.dir}/dfs/namesecondary</value>
        <final>true</final>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file://${hadoop.tmp.dir}/dfs/name</value>
        <final>true</final>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file://${hadoop.tmp.dir}/dfs/data</value>
        <final>true</final>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <property>
        <name>dfs.permissions</name>
        <value>false</value>
    </property>
    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>
    <!-- the include/exclude keys hdfs actually reads are dfs.hosts and dfs.hosts.exclude -->
    <property>
        <name>dfs.hosts.exclude</name>
        <value>/opt/hadoop-2.5.2/other/excludes</value>
        <description>Names a file that contains a list of hosts that are not permitted to connect to the namenode. The full pathname of the file must be specified. If the value is empty, no hosts are excluded.</description>
    </property>
    <property>
        <name>dfs.hosts</name>
        <value>/opt/hadoop-2.5.2/etc/hadoop/slaves</value>
    </property>
    <property>
        <name>dfs.blocksize</name>
        <value>134217728</value>
    </property>
    <!-- HBase configuration -->
    <property>
        <name>dfs.datanode.max.xcievers</name>
        <value>4096</value>
    </property>
    <!-- Zookeeper configuration -->
    <property>
        <name>dfs.nameservices</name>
        <value>ns1</value>
    </property>
    <property>
        <name>dfs.ha.namenodes.ns1</name>
        <value>nn1,nn2</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.ns1.nn1</name>
        <value>hadoop.master:8020</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.ns1.nn2</name>
        <value>hadoop.slaver1:8020</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.ns1.nn1</name>
        <value>hadoop.master:50070</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.ns1.nn2</name>
        <value>hadoop.slaver1:50070</value>
    </property>
    <property>
        <name>dfs.namenode.servicerpc-address.ns1.nn1</name>
        <value>hadoop.master:53310</value>
    </property>
    <property>
        <name>dfs.namenode.servicerpc-address.ns1.nn2</name>
        <value>hadoop.slaver1:53310</value>
    </property>
    <!-- where the JournalNodes keep their data on local disk -->
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/opt/zookeeper-3.4.7/journal</value>
    </property>
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://hadoop.master:8485;hadoop.slaver1:8485;hadoop.slaver2:8485/ns1</value>
    </property>
    <!-- enable automatic NameNode failover -->
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <!-- how clients locate the active namenode after a failover -->
    <property>
        <name>dfs.client.failover.proxy.provider.ns1</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <!-- the zookeeper quorum -->
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>hadoop.master:2181,hadoop.slaver1:2181,hadoop.slaver2:2181</value>
    </property>
    <!-- fencing methods; separate multiple methods with newlines, one per line -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>
            sshfence
            shell(/bin/true)
        </value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/hadoop/.ssh/id_rsa</value>
    </property>
    <!-- sshfence timeout -->
    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
    </property>
</configuration>

####log4j.properties####
hadoop.root.logger=INFO,console
hadoop.log.dir=/opt/hadoop-2.5.2/logs
hadoop.log.file=hadoop.log

####mapred-env.sh####
export HADOOP_JOB_HISTORYSERVER_HEAPSIZE=1000
export HADOOP_MAPRED_ROOT_LOGGER=INFO,RFA

####mapred-site.xml####
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.application.classpath</name>
        <value>
            /opt/hadoop-2.5.2/etc/hadoop,
            /opt/hadoop-2.5.2/share/hadoop/common/*,
            /opt/hadoop-2.5.2/share/hadoop/common/lib/*,
            /opt/hadoop-2.5.2/share/hadoop/hdfs/*,
            /opt/hadoop-2.5.2/share/hadoop/hdfs/lib/*,
            /opt/hadoop-2.5.2/share/hadoop/mapreduce/*,
            /opt/hadoop-2.5.2/share/hadoop/mapreduce/lib/*,
            /opt/hadoop-2.5.2/share/hadoop/yarn/*,
            /opt/hadoop-2.5.2/share/hadoop/yarn/lib/*
        </value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>hadoop.master:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>hadoop.master:19888</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.done-dir</name>
        <value>/history/done</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.intermediate-done-dir</name>
        <value>/history/done_intermediate</value>
    </property>
</configuration>

####masters####
hadoop.slaver1    //holds the secondary namenode's hostname

####slaves####
hadoop.slaver1
hadoop.slaver2
hadoop.slaver3

####yarn-env.sh####
export JAVA_HOME=/usr/java/jdk1.8.0_65

####yarn-site.xml####
<configuration>
    <!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>hadoop.master:18040</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>hadoop.master:18030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>hadoop.master:18025</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>hadoop.master:18041</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>hadoop.master:8088</value>
    </property>
    <property>
        <name>yarn.nodemanager.local-dirs</name>
        <value>/opt/hadoop-2.5.2/other/mynode</value>
    </property>
    <property>
        <name>yarn.nodemanager.log-dirs</name>
        <value>/opt/hadoop-2.5.2/other/logs</value>
    </property>
    <property>
        <name>yarn.nodemanager.log.retain-seconds</name>
        <value>10800</value>
    </property>
    <property>
        <name>yarn.nodemanager.remote-app-log-dir</name>
        <value>/opt/hadoop-2.5.2/other/logs</value>
    </property>
    <property>
        <name>yarn.nodemanager.remote-app-log-dir-suffix</name>
        <value>logs</value>
    </property>
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>-1</value>
    </property>
    <property>
        <name>yarn.log-aggregation.retain-check-interval-seconds</name>
        <value>-1</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <!-- zookeeper -->
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>yrc</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>hadoop.master</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>hadoop.slaver1</value>
    </property>
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>hadoop.master:2181,hadoop.slaver1:2181,hadoop.slaver2:2181</value>
    </property>
</configuration>
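Before distributing the files, it is worth checking that hdfs parses them the way you expect. A small sanity check, run on the node where the files were edited (hdfs getconf reads the same configuration the daemons will use, so typos in the HA keys show up before the first start):

    hdfs getconf -confKey fs.defaultFS            # expect hdfs://ns1
    hdfs getconf -confKey dfs.nameservices        # expect ns1
    hdfs getconf -confKey dfs.ha.namenodes.ns1    # expect nn1,nn2
    hdfs getconf -namenodes                       # expect hadoop.master hadoop.slaver1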
5. Distribute to every node:
    scp -r /opt/hadoop-2.5.2 hadoop@hostname:/opt    //repeat for each of the other nodes

6. First startup
    6.1 Start zk: zkServer.sh start (run on every zk node)
    6.2 Start the journalnodes: hadoop-daemon.sh start journalnode (run on every zk node)
    6.3 Format the namenode: hadoop namenode -format (run on the namenode node; hdfs namenode -format is the non-deprecated equivalent)
    6.4 Start the namenode: hadoop-daemon.sh start namenode (run on the namenode node)
    6.5 Bootstrap the other namenode: hadoop namenode -bootstrapStandby (run on the standby namenode node)
    6.6 Format zk: hdfs zkfc -formatZK (run on the namenode node)
    6.7 Stop all services: stop-all.sh. Note that zkServer.sh stop must also be run on every zk node at this point.

7. Normal startup
    1. Start zk: zkServer.sh start (run on every zk node)
    2. Start all services: start-all.sh    //or run start-dfs.sh first, then start-yarn.sh
    3. Start the history server: mr-jobhistory-daemon.sh start historyserver (running it on the namenode node is enough)
    4. Start the standby resourcemanager: yarn-daemon.sh start resourcemanager    //run on the standby node
    5. Start the standby namenode: hadoop-daemon.sh start namenode    //run on the standby node

8. Verification
    1. jps: check that the expected processes are running.
    2. Web UIs:
        hdfs      hostname:50070
        yarn      hostname:8088
        history   hostname:19888
        //all hostnames above are the namenode node's hostname (currently in the active state)
    3. Check the active state:
        hdfs: the web UI shows one of two states, active and standby
        yarn: check from the shell with yarn rmadmin -getServiceState rm1 (or rm2)    //rm1/rm2 are the names set in the configuration
    4. kill the currently active namenode and check that the cluster fails over to the standby namenode by itself (a sketch follows this section).
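A sketch of the failover test in 8.4, run on hadoop.master and assuming nn1 is the currently active namenode:

    #!/bin/bash
    hdfs haadmin -getServiceState nn1        # expect: active
    hdfs haadmin -getServiceState nn2        # expect: standby
    # kill the active namenode process on this host
    kill -9 $(jps | awk '$2 == "NameNode" {print $1}')
    sleep 10
    hdfs haadmin -getServiceState nn2        # expect: active (zkfc promoted the standby)
    # bring the killed namenode back; it rejoins as standby
    hadoop-daemon.sh start namenode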
9. Common commands
    ####start/stop the yarn jobhistory server####
    web: http://namenode:19888    //namenode here is the hostname of the node running the history server
    mr-jobhistory-daemon.sh start historyserver    //run once on every host in the cluster
    mr-jobhistory-daemon.sh stop historyserver
    ####start/stop/status zk####
    zkServer.sh start    //run once on every host in the cluster
    zkServer.sh stop
    zkServer.sh status
    ####start/stop/inspect yarn####
    yarn-daemon.sh start resourcemanager
    yarn-daemon.sh stop resourcemanager
    yarn-daemon.sh stop nodemanager
    yarn rmadmin -getServiceState rm2    //rm2 is the alias configured for the cluster
    web: http://namenode:8088    //namenode is the hostname of the active resourcemanager
    ####start/stop/inspect hadoop####
    hadoop-daemon.sh start namenode
    hadoop-daemon.sh stop namenode
    hadoop-daemon.sh stop datanode
    web: http://namenode:50070    //namenode is the hostname of the active namenode
    ####format the zkNode####
    hdfs zkfc -formatZK    //run on the namenode node; note it is hdfs, not hadoop
    ####start/stop zkfc####
    hadoop-daemon.sh start zkfc
    hadoop-daemon.sh stop zkfc
    ####list/kill jobs####
    hadoop job -list
    hadoop job -kill job-ID    //note: the job ID, not the applicationID
    ####initialize the Journal Storage Directory####
    hdfs namenode -initializeSharedEdits    //run when converting a non-HA cluster to HA; not needed if the cluster was HA from the start
    ####format the namenode####
    hadoop namenode -format    //run on the namenode node
    hdfs namenode -bootstrapStandby    //run on the standby namenode node; the first namenode must already be running

10. Common problems
    1. "Journal Storage Directory /opt/zookeeper-3.4.7/journal/ns1 not formatted"
       Cause: hadoop was first deployed without HA; switching to HA afterwards triggers this error.
       Fix:
       1. Delete the directory configured as dfs.journalnode.edits.dir in hdfs-site.xml.
       2. hdfs namenode -initializeSharedEdits (run on the namenode)
    2. The datanodes come up but the namenode does not.
       Fix:
       1. Check that the relevant configuration entries are correct.
       2. Check the environment variables.
       3. Check the host network mapping (/etc/hosts).
       4. Check whether the namenode was formatted a second time. If so, change the datanodes' clusterID and namespaceID to match the namenode's; they normally live under the tmp directory (see the sketch after this section).
       5. Restart hdfs.
       6. If none of the above helps, delete every folder under the tmp directory while the hadoop services are running, then reformat and restart the services.
    3. Both namenodes come up, but both are in standby state.
       Fix:
       1. Check that zk was started on every zk node.
       2. Reformat zkfc: hdfs zkfc -formatZK
       3. Restart all services (including zk).
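For problem 2.4 the IDs can be compared directly. A sketch, assuming the name/data directories sit under the hadoop.tmp.dir configured above (adjust the paths if dfs.namenode.name.dir / dfs.datanode.data.dir point elsewhere):

    # on the namenode:
    cat /opt/hadoop-2.5.2/tmp/dfs/name/current/VERSION
    # on each datanode (namespaceID lives one level deeper, under the BP-* block pool directory):
    cat /opt/hadoop-2.5.2/tmp/dfs/data/current/VERSION
    cat /opt/hadoop-2.5.2/tmp/dfs/data/current/BP-*/current/VERSION
    # make the datanode clusterID/namespaceID match the namenode's values, or
    # (losing the replicas stored there) wipe the data directory and restart the datanode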