1. Environment overview
Platform: physical machines
Operating system: CentOS 6.5
Software versions: hadoop-2.5.2, hbase-1.1.2-bin, jdk-7u79-linux-x64, protobuf-2.5.0, snappy-1.1.1, zookeeper-3.4.6, hadoop-snappy-0.0.1-SNAPSHOT
Deployment user: hadoop
Package staging directory: /opt/soft
Install directory: /opt/server
Data directory: /opt/data
Log directory: /opt/var/logs
Hostname | IP address | Hadoop processes |
INVOICE-GL-01 | 10.162.16.6 | QuorumPeerMain ,HMaster |
INVOICE-GL-02 | 10.162.16.7 | QuorumPeerMain ,HMaster |
INVOICE-GL-03 | 10.162.16.8 | QuorumPeerMain ,HMaster |
INVOICE-23 | 10.162.16.227 | NameNode, DFSZKFailoverController |
INVOICE-24 | 10.162.16.228 | NameNode, DFSZKFailoverController |
INVOICE-25 | 10.162.16.229 | JournalNode, DataNode, HRegionServer |
INVOICE-26 | 10.162.16.230 | JournalNode, DataNode, HRegionServer |
INVOICE-27 | 10.162.16.231 | JournalNode, DataNode, HRegionServer |
INVOICE-DN-01 | 10.162.16.232 | DataNode, HRegionServer |
INVOICE-DN-02 | 10.162.16.233 | DataNode, HRegionServer |
INVOICE-DN-03 | 10.162.16.234 | DataNode, HRegionServer |
INVOICE-DN-04 | 10.162.16.235 | DataNode, HRegionServer |
INVOICE-DN-05 | 10.162.16.236 | DataNode, HRegionServer |
INVOICE-DN-06 | 10.162.16.237 | DataNode, HRegionServer |
2. Installation steps
1) Disable the firewall and SELinux
##sed -i '/SELINUX/s/enforcing/disabled/' /etc/selinux/config
##setenforce 0
##chkconfig iptables off
##/etc/init.d/iptables stop
2) Synchronize time with the NTP server
##vim /etc/ntp.conf
server <NTP server address>
driftfile /var/lib/ntp/drift
logfile /var/log/ntp
##/etc/init.d/ntpd start
3) Update /etc/hosts with all hostnames on every machine
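The /etc/hosts entries for this step would look like the following fragment, built directly from the inventory table in section 1:

```
10.162.16.6     INVOICE-GL-01
10.162.16.7     INVOICE-GL-02
10.162.16.8     INVOICE-GL-03
10.162.16.227   INVOICE-23
10.162.16.228   INVOICE-24
10.162.16.229   INVOICE-25
10.162.16.230   INVOICE-26
10.162.16.231   INVOICE-27
10.162.16.232   INVOICE-DN-01
10.162.16.233   INVOICE-DN-02
10.162.16.234   INVOICE-DN-03
10.162.16.235   INVOICE-DN-04
10.162.16.236   INVOICE-DN-05
10.162.16.237   INVOICE-DN-06
```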
4) Create the hadoop user and the required directories on every machine
##useradd hadoop
##echo "hadoop" | passwd --stdin hadoop
##mkdir -p /opt/server
##mkdir -p /opt/soft
##mkdir -p /opt/data
##mkdir -p /opt/var/logs
##chown -R hadoop:hadoop /opt/server
##chown -R hadoop:hadoop /opt/soft
##chown -R hadoop:hadoop /opt/data
##chown -R hadoop:hadoop /opt/var/logs
5) Passwordless SSH setup (run on the four management nodes INVOICE-GL-01, INVOICE-23, INVOICE-24, INVOICE-25):
##su - hadoop
##ssh-keygen
##ssh-copy-id INVOICE-GL-01
##ssh-copy-id INVOICE-GL-02
##ssh-copy-id INVOICE-GL-03
##ssh-copy-id INVOICE-23
##ssh-copy-id INVOICE-24
.....
## ssh-copy-id INVOICE-DN-05
## ssh-copy-id INVOICE-DN-06
Make sure all four management nodes can log in to every machine as the hadoop user without a password,
e.g. ssh INVOICE-23 should log in directly with no password prompt.
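The repeated ssh-copy-id calls above can be scripted. The loop below is a sketch; the host list comes from the inventory table in section 1, and the hypothetical DRY_RUN switch (defaulting to on) only prints the commands so nothing touches real hosts until you flip it.

```shell
#!/bin/sh
# Sketch: push the hadoop user's public key to every node in the cluster.
# DRY_RUN=1 (the default here) only prints the commands that would run.
HOSTS="INVOICE-GL-01 INVOICE-GL-02 INVOICE-GL-03 INVOICE-23 INVOICE-24 \
INVOICE-25 INVOICE-26 INVOICE-27 INVOICE-DN-01 INVOICE-DN-02 INVOICE-DN-03 \
INVOICE-DN-04 INVOICE-DN-05 INVOICE-DN-06"

copy_keys() {
  for h in $HOSTS; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "ssh-copy-id $h"     # dry run: show the command only
    else
      ssh-copy-id "$h"
    fi
  done
}

copy_keys
```

Run it once per management node (after ssh-keygen), then set DRY_RUN=0 when the output looks right.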
#####################################################################
############ All operations below run as the hadoop user ############
#####################################################################
6) Configure environment variables on every machine:
## vim /home/hadoop/.bash_profile
################set java home##################
export JAVA_HOME=/opt/server/jdk1.7.0_79
export PATH=$JAVA_HOME/bin:$PATH
###############set zk home##################
export ZOOKEEPER_HOME=/opt/server/zookeeper-3.4.6
export PATH=$ZOOKEEPER_HOME/bin:$PATH
############set hadoop home###################
export HADOOP_HOME=/opt/server/hadoop-2.5.2
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
###############set hbase home################
export HBASE_HOME=/opt/server/hbase-1.1.2
export PATH=$HBASE_HOME/bin:$PATH
##source /home/hadoop/.bash_profile
7) Deploy the JDK:
##cd /opt/soft
##tar -zxvf jdk-7u79-linux-x64.tar.gz
##Copy to the other machines (run from INVOICE-GL-01)
##scp -r jdk1.7.0_79 INVOICE-GL-01:/opt/server/
##scp -r jdk1.7.0_79 INVOICE-GL-02:/opt/server/
##scp -r jdk1.7.0_79 INVOICE-GL-03:/opt/server/
##scp -r jdk1.7.0_79 INVOICE-23:/opt/server/
...
## scp -r jdk1.7.0_79 INVOICE-DN-05:/opt/server/
## scp -r jdk1.7.0_79 INVOICE-DN-06:/opt/server/
##mv jdk1.7.0_79 /opt/server
##java -version
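The same scp fan-out recurs below for Hadoop and HBase, so it is worth wrapping once. distribute() is a hypothetical helper, not part of any installed software; with the default DRY_RUN=1 it only prints the commands instead of copying anything.

```shell
#!/bin/sh
# Sketch: distribute an extracted package to /opt/server on every other node.
# DRY_RUN=1 (the default here) makes this a dry run that prints the commands.
HOSTS="INVOICE-GL-01 INVOICE-GL-02 INVOICE-GL-03 INVOICE-23 INVOICE-24 \
INVOICE-25 INVOICE-26 INVOICE-27 INVOICE-DN-01 INVOICE-DN-02 INVOICE-DN-03 \
INVOICE-DN-04 INVOICE-DN-05 INVOICE-DN-06"

distribute() {   # usage: distribute <dir>, e.g. distribute jdk1.7.0_79
  pkg=$1
  for h in $HOSTS; do
    [ "$h" = "$(hostname)" ] && continue        # skip the local machine
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "scp -r $pkg $h:/opt/server/"
    else
      scp -r "$pkg" "$h:/opt/server/"
    fi
  done
}

distribute jdk1.7.0_79
```

The same call works later with hadoop-2.5.2 and hbase-1.1.2 as the argument.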
8) Deploy ZooKeeper:
##cd /opt/soft
##tar -zxvf zookeeper-3.4.6.tar.gz
##cp zookeeper-3.4.6/conf/zoo_sample.cfg zookeeper-3.4.6/conf/zoo.cfg
##vim zookeeper-3.4.6/conf/zoo.cfg
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/data/zookeeper
dataLogDir=/opt/var/logs/zookeeper
clientPort=2181
maxClientCnxns=10000
autopurge.snapRetainCount=3
autopurge.purgeInterval=2
server.1=INVOICE-GL-01:2888:3888
server.2=INVOICE-GL-02:2888:3888
server.3=INVOICE-GL-03:2888:3888
##vim zookeeper-3.4.6/conf/java.env
#!/bin/bash
export JAVA_HOME=/opt/server/jdk1.7.0_79
export JVMFLAGS="-Xms2048m -Xmx10240m $JVMFLAGS"
##vim zookeeper-3.4.6/bin/zkEnv.sh
(change the parameter below so zookeeper.out is stored under this path)
ZOO_LOG_DIR="/opt/var/logs/zookeeper"
##Copy to the other machines (run from INVOICE-GL-01)
##scp -r zookeeper-3.4.6 INVOICE-GL-02:/opt/server/
##scp -r zookeeper-3.4.6 INVOICE-GL-03:/opt/server/
## mv zookeeper-3.4.6 /opt/server/
##On INVOICE-GL-01, create the directories
##mkdir -p /opt/data/zookeeper
##mkdir -p /opt/var/logs/zookeeper
##echo "1" > /opt/data/zookeeper/myid
##zkServer.sh start
##On INVOICE-GL-02, create the directories
##mkdir -p /opt/data/zookeeper
##mkdir -p /opt/var/logs/zookeeper
##echo "2" > /opt/data/zookeeper/myid
##zkServer.sh start
##On INVOICE-GL-03, create the directories
##mkdir -p /opt/data/zookeeper
##mkdir -p /opt/var/logs/zookeeper
##echo "3" > /opt/data/zookeeper/myid
##zkServer.sh start
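The myid written on each node must match its server.N entry in zoo.cfg. Rather than hard-coding "1", "2", "3", the id can be derived from the config so the two never drift apart. myid_for is a hypothetical helper, not part of ZooKeeper; the demo below runs against an inline copy of the quorum section from the zoo.cfg above.

```shell
#!/bin/sh
# Sketch: compute a node's ZooKeeper myid from the server.N lines in zoo.cfg.
myid_for() {   # usage: myid_for <zoo.cfg> <hostname>  -> prints the numeric id
  sed -n "s/^server\.\([0-9][0-9]*\)=[[:space:]]*$2:.*/\1/p" "$1"
}

# Demo against an inline copy of the quorum entries from zoo.cfg:
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
server.1=INVOICE-GL-01:2888:3888
server.2=INVOICE-GL-02:2888:3888
server.3=INVOICE-GL-03:2888:3888
EOF
echo "myid for INVOICE-GL-02 is $(myid_for "$cfg" INVOICE-GL-02)"
# On a real node:
#   myid_for /opt/server/zookeeper-3.4.6/conf/zoo.cfg "$(hostname)" > /opt/data/zookeeper/myid
```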
9) Deploy Hadoop:
###Install Snappy compression support on every machine
##yum -y install gcc gcc-c++ automake autoconf libtool
##cd /opt/soft/
##tar -zxvf snappy-1.1.1.tar.gz
## cd snappy-1.1.1
##./configure && make && make install
##cd /opt/soft/
##tar -xvf protobuf-2.5.0.tar
##cd protobuf-2.5.0
##./configure && make && make install
##echo "/usr/local/lib" >>/etc/ld.so.conf
##ldconfig
###Install Hadoop on INVOICE-25
##cd /opt/soft
##tar -zxvf hadoop-2.5.2.tar.gz
##vim hadoop-2.5.2/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/opt/server/jdk1.7.0_79
export HADOOP_HOME=/opt/server/hadoop-2.5.2
export HADOOP_LOG_DIR=/opt/var/logs/hadoop
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HADOOP_HOME/lib/native/Linux-amd64-64/:/usr/local/lib/
##vim hadoop-2.5.2/etc/hadoop/core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://mycluster</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/data/hadoop</value>
</property>
<property>
<name>fs.trash.interval</name>
<value>120</value>
</property>
<property>
<name>io.compression.codecs</name>
<value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.SnappyCodec</value></property>
</configuration>
##vim hadoop-2.5.2/etc/hadoop/hdfs-site.xml
<configuration>
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
</property>
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn1</name>
<value>INVOICE-23:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn2</name>
<value>INVOICE-24:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn1</name>
<value>INVOICE-23:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn2</name>
<value>INVOICE-24:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://INVOICE-25:8485;INVOICE-26:8485;INVOICE-27:8485/mycluster</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hadoop/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/opt/data/journal/local/data</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>INVOICE-GL-01:2181,INVOICE-GL-02:2181,INVOICE-GL-03:2181</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/opt/data/hadoop1,/opt/data/hadoop2</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.namenode.handler.count</name>
<value>40</value>
</property>
</configuration>
##vim hadoop-2.5.2/etc/hadoop/slaves
INVOICE-25
INVOICE-26
INVOICE-27
INVOICE-DN-01
INVOICE-DN-02
INVOICE-DN-03
INVOICE-DN-04
INVOICE-DN-05
INVOICE-DN-06
###Copy the snappy libraries into the Hadoop tree
##tar -zxvf hadoop-snappy-0.0.1-SNAPSHOT.tar.gz
## cp -r hadoop-snappy-0.0.1-SNAPSHOT/lib/hadoop-snappy-0.0.1-SNAPSHOT.jar hadoop-2.5.2/lib/
## cp -r hadoop-snappy-0.0.1-SNAPSHOT/lib/native/Linux-amd64-64 hadoop-2.5.2/lib/native/
##Copy to the other machines (run from INVOICE-25)
##scp -r hadoop-2.5.2 INVOICE-GL-01:/opt/server/
##scp -r hadoop-2.5.2 INVOICE-GL-02:/opt/server/
##scp -r hadoop-2.5.2 INVOICE-GL-03:/opt/server/
##scp -r hadoop-2.5.2 INVOICE-23:/opt/server/
...
##scp -r hadoop-2.5.2 INVOICE-DN-05:/opt/server/
##scp -r hadoop-2.5.2 INVOICE-DN-06:/opt/server/
## mv hadoop-2.5.2 /opt/server/
###Create the directories on every machine
mkdir -p /opt/data/hadoop
mkdir -p /opt/var/logs/hadoop
mkdir -p /opt/data/journal/local/data
###Initialize the Hadoop cluster
###Start a JournalNode on INVOICE-25, INVOICE-26 and INVOICE-27
##hadoop-daemon.sh start journalnode
###Format the NameNode on INVOICE-23 and start it
##hdfs namenode -format
##hadoop-daemon.sh start namenode
###Bootstrap the standby NameNode on INVOICE-24
##hdfs namenode -bootstrapStandby
###Format the ZKFC state in ZooKeeper on INVOICE-23
##hdfs zkfc -formatZK
###On INVOICE-25, stop and then restart the whole HDFS cluster
stop-dfs.sh
start-dfs.sh
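After the restart, exactly one NameNode should be active and the other standby. On the cluster the two states come from `hdfs haadmin -getServiceState nn1` and `hdfs haadmin -getServiceState nn2`; the sketch below feeds sample states into a hypothetical check_ha helper that succeeds only for one active plus one standby, in either order.

```shell
#!/bin/sh
# Sketch: verify the NameNode HA pair. check_ha reads one service state per
# line on stdin and succeeds only for exactly one "active" and one "standby".
check_ha() {
  states=$(sort | tr '\n' ' ')
  [ "$states" = "active standby " ]
}

# Demo with sample input (on the cluster, pipe in the two haadmin outputs):
if printf 'active\nstandby\n' | check_ha; then
  echo "HA pair healthy"
fi
```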
Adding the YARN service
# configure on the master nodes
# su - hadoop
# cd /opt/server/hadoop-2.5.2/etc/hadoop
# cat ./yarn-site.xml
<?xml version="1.0"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<configuration>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>nn1,nn2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.nn1</name>
<value>INVOICE-23</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.nn2</name>
<value>INVOICE-24</value>
</property>
<property>
<name>yarn.resourcemanager.ha.id</name>
<value>nn1</value> <!-- on the second master, change this to nn2 -->
</property>
<property>
<name>yarn.resourcemanager.address.nn1</name>
<value>${yarn.resourcemanager.hostname.nn1}:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address.nn1</name>
<value>${yarn.resourcemanager.hostname.nn1}:8030</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.https.address.nn1</name>
<value>${yarn.resourcemanager.hostname.nn1}:8089</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.nn1</name>
<value>${yarn.resourcemanager.hostname.nn1}:8088</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address.nn1</name>
<value>${yarn.resourcemanager.hostname.nn1}:8025</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address.nn1</name>
<value>${yarn.resourcemanager.hostname.nn1}:8041</value>
</property>
<property>
<name>yarn.resourcemanager.address.nn2</name>
<value>${yarn.resourcemanager.hostname.nn2}:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address.nn2</name>
<value>${yarn.resourcemanager.hostname.nn2}:8030</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.https.address.nn2</name>
<value>${yarn.resourcemanager.hostname.nn2}:8089</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.nn2</name>
<value>${yarn.resourcemanager.hostname.nn2}:8088</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address.nn2</name>
<value>${yarn.resourcemanager.hostname.nn2}:8025</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address.nn2</name>
<value>${yarn.resourcemanager.hostname.nn2}:8041</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>/opt/data/hadoop/yarn</value>
</property>
<property>
<name>yarn.nodemanager.log-dirs</name>
<value>/opt/var/logs/hadoop</value>
</property>
<property>
<name>yarn.client.failover-proxy-provider</name>
<value>org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider</value>
</property>
<property>
<name>yarn.resourcemanager.zk-state-store.address</name>
<value>INVOICE-GL-01:2181,INVOICE-GL-02:2181,INVOICE-GL-03:2181</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>INVOICE-GL-01:2181,INVOICE-GL-02:2181,INVOICE-GL-03:2181</value>
</property>
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>cluster</value>
</property>
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
</configuration>
# cat mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
# start-yarn.sh
10) Deploy HBase:
###Install HBase on INVOICE-25
##cd /opt/soft
##tar -zxvf hbase-1.1.2-bin.tar.gz
##vim hbase-1.1.2/conf/hbase-env.sh
export JAVA_HOME=/opt/server/jdk1.7.0_79
export HADOOP_HOME=/opt/server/hadoop-2.5.2
export HBASE_HOME=/opt/server/hbase-1.1.2
export HBASE_MANAGES_ZK=false
export HBASE_LOG_DIR=/opt/var/logs/hbase
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HADOOP_HOME/lib/native/Linux-amd64-64/:/usr/local/lib/
export HBASE_LIBRARY_PATH=$HBASE_LIBRARY_PATH:$HBASE_HOME/lib/native/Linux-amd64-64/:/usr/local/lib/
export CLASSPATH=$CLASSPATH:$HBASE_LIBRARY_PATH
## vim hbase-1.1.2/conf/hbase-site.xml
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://mycluster/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>INVOICE-GL-01,INVOICE-GL-02,INVOICE-GL-03</value>
</property>
<property>
<name>hbase.regionserver.codecs</name>
<value>snappy</value>
</property>
<property>
<name>hbase.regionserver.handler.count</name>
<value>500</value>
</property>
<property>
<name>dfs.support.append</name>
<value>true</value>
</property>
<property>
<name>zookeeper.session.timeout</name>
<value>60000</value>
</property>
<property>
<name>hbase.master.distributed.log.splitting</name>
<value>false</value>
</property>
<property>
<name>hbase.rpc.timeout</name>
<value>600000</value>
</property>
<property>
<name>hbase.client.scanner.timeout.period</name>
<value>60000</value>
</property>
<property>
<name>hbase.snapshot.master.timeoutMillis</name>
<value>600000</value>
</property>
<property>
<name>hbase.snapshot.region.timeout</name>
<value>600000</value>
</property>
<property>
<name>hbase.hregion.max.filesize</name>
<value>107374182400</value>
</property>
<property>
<name>hbase.master.maxclockskew</name>
<value>180000</value>
</property>
<property>
<name>hbase.zookeeper.property.maxClientCnxns</name>
<value>10000</value>
</property>
<property>
<name>hbase.regionserver.region.split.policy</name>
<value>org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy</value>
</property>
</configuration>
##vim hbase-1.1.2/conf/regionservers
INVOICE-25
INVOICE-26
INVOICE-27
INVOICE-DN-01
INVOICE-DN-02
INVOICE-DN-03
INVOICE-DN-04
INVOICE-DN-05
INVOICE-DN-06
##vim hbase-1.1.2/conf/backup-masters
INVOICE-GL-02
INVOICE-GL-03
##ln -s /opt/server/hadoop-2.5.2/etc/hadoop/hdfs-site.xml /opt/server/hbase-1.1.2/conf/
##Copy to the other machines (run from INVOICE-25)
##scp -r hbase-1.1.2 INVOICE-GL-01:/opt/server/
##scp -r hbase-1.1.2 INVOICE-GL-02:/opt/server/
##scp -r hbase-1.1.2 INVOICE-GL-03:/opt/server/
##scp -r hbase-1.1.2 INVOICE-23:/opt/server/
...
##scp -r hbase-1.1.2 INVOICE-DN-05:/opt/server/
##scp -r hbase-1.1.2 INVOICE-DN-06:/opt/server/
## mv hbase-1.1.2 /opt/server/
###Create the log directory on every machine
mkdir -p /opt/var/logs/hbase
###Start HBase (run from INVOICE-GL-01)
start-hbase.sh
11) Basic maintenance commands:
(INVOICE-GL-01, INVOICE-GL-02 and INVOICE-GL-03 handle start and stop; on the other machines only check status)
ZooKeeper start, status, stop:
On INVOICE-GL-01, INVOICE-GL-02, INVOICE-GL-03: zkServer.sh start|status|stop
Checking ZooKeeper health:
One leader and two followers across the three nodes is the healthy state.
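That one-leader rule can be checked mechanically. The sketch below counts "Mode: leader" lines; on the cluster the input would be the combined `zkServer.sh status` output from the three GL nodes, and count_leaders is a hypothetical helper, not part of ZooKeeper.

```shell
#!/bin/sh
# Sketch: a quorum is healthy when exactly one node reports "Mode: leader".
count_leaders() {   # reads zkServer.sh status output on stdin, prints a count
  grep -c '^Mode: leader'
}

# Demo with sample status lines from a healthy three-node ensemble:
printf 'Mode: leader\nMode: follower\nMode: follower\n' | count_leaders
```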
Hadoop start, stop:
On INVOICE-25:
Start: start-dfs.sh
Stop: stop-dfs.sh
Status: hdfs fsck /
hdfs dfsadmin -report
HBase start, stop:
On INVOICE-GL-01:
Start: start-hbase.sh
Stop: stop-hbase.sh
Status: hbase shell, then run the status command