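Preparation requirements:
1. CentOS 6.4: add a hadoop user and configure /etc/hosts on every machine in the cluster.
2. Set up SSH for the hadoop user and enable passwordless login between all machines in the cluster (HA needs this when it performs fencing).
3. Download the community release of the hadoop-2.2.0 source.
(The software needed to compile hadoop 2.2.0 can be downloaded here: http://pan.baidu.com/s/1mgodf40. English reference: http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-common/NativeLibraries.html)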
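Step 2 can be sketched roughly as follows. This is a minimal sketch, not necessarily the exact procedure used here: it assumes an RSA key with an empty passphrase and that ssh-copy-id is installed. The helper names run and setup_ssh and the DRY_RUN switch are made up for illustration, and the host names are placeholders.

```shell
#!/bin/sh
# Sketch: passwordless SSH for the hadoop user.
# Assumptions: RSA key, empty passphrase, ssh-copy-id available.

# run: hypothetical helper; prints the command when DRY_RUN=1, otherwise runs it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# setup_ssh HOST...: create a key pair if none exists, then install the
# public key on every host given (host names are placeholders).
setup_ssh() {
    key="$HOME/.ssh/id_rsa"
    if [ ! -f "$key" ]; then
        run mkdir -p "$HOME/.ssh"
        run chmod 700 "$HOME/.ssh"
        run ssh-keygen -t rsa -N "" -f "$key"
    fi
    for host in "$@"; do
        run ssh-copy-id -i "$key.pub" "hadoop@$host"
    done
}
```

Run it as the hadoop user on each machine, for example setup_ssh hostname1 hostname2 hostname3. With DRY_RUN=1 it only prints the commands it would run.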
--------------------------------------------------------------------------------------------
yum -y install lzo-devel zlib-devel gcc autoconf automake libtool gcc-c++
yum install openssl-devel
yum install ncurses-devel
--------------------------------------------------------------------------------------------
Other build tools needed: Ant, Maven, ProtocolBuffer, findbugs, CMake
# Install Java
yum -y install jdk
Build and install Protobuf
tar -zxvf protobuf-2.5.0.tar.gz
cd protobuf-2.5.0
./configure --prefix=/usr/local/protobuf
make
make install
cd ..
Install Ant
tar -zxvf apache-ant-1.9.2-bin.tar.gz
mv apache-ant-1.9.2/ /usr/local/ant
Install Maven
tar -zxvf apache-maven-3.0.5-bin.tar.gz
mv apache-maven-3.0.5/ /usr/local/maven
Install findbugs
tar -zxvf findbugs-2.0.2.tar.gz
mv findbugs-2.0.2/ /usr/local/findbugs
Build and install cmake
tar -zvxf cmake-2.8.6.tar.gz
cd cmake-2.8.6
./bootstrap
make
make install
cd ..
--------------------------------------------------------------------------------------------
Configure the environment
# Adjust these settings to match your own environment
vi /etc/profile
#java
export JAVA_HOME=/usr/java/jdk1.7.0_45
export JRE_HOME=/usr/java/jdk1.7.0_45/jre
export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib:$JRE_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
#maven
export MAVEN_HOME=/usr/local/maven
export MAVEN_OPTS="-Xms256m -Xmx512m"
export CLASSPATH=.:$CLASSPATH:$MAVEN_HOME/lib
export PATH=$PATH:$MAVEN_HOME/bin
#protobuf
export PROTOBUF_HOME=/usr/local/protobuf
export CLASSPATH=.:$CLASSPATH:$PROTOBUF_HOME/lib
export PATH=$PATH:$PROTOBUF_HOME/bin
#ant
export ANT_HOME=/usr/local/ant
export CLASSPATH=.:$CLASSPATH:$ANT_HOME/lib
export PATH=$PATH:$ANT_HOME/bin
#findbugs
export FINDBUGS_HOME=/usr/local/findbugs
export CLASSPATH=.:$CLASSPATH:$FINDBUGS_HOME/lib
export PATH=$PATH:$FINDBUGS_HOME/bin
source /etc/profile
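Before starting the long build it may help to confirm that the tools installed above are actually on the PATH. The sketch below is only an illustration: missing_tools is a made-up helper, and the command names (java, mvn, ant, protoc, cmake, findbugs) are assumed to be the executables those packages provide.

```shell
#!/bin/sh
# missing_tools NAME...: print each command name that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumed executable names for the build tools listed earlier.
missing=$(missing_tools java mvn ant protoc cmake findbugs)
if [ -n "$missing" ]; then
    echo "not on PATH:" $missing
else
    echo "all build tools found"
fi
```

If anything is reported missing, re-check the matching export lines in /etc/profile and run source /etc/profile again.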
--------------------------------------------------------------------------------------------
vi /hadoop-2.2.0/hadoop-common-project/hadoop-auth/pom.xml
<dependency>
<groupId>org.mortbay.jetty</groupId>
<artifactId>jetty</artifactId>
<scope>test</scope>
</dependency>
After the block above, add the following:
<dependency>
<groupId>org.mortbay.jetty</groupId>
<artifactId>jetty-util</artifactId>
<scope>test</scope>
</dependency>
Note: without this change, the build may fail with the following error:
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.5.1:testCompile (default-testCompile) on project hadoop-auth: Compilation failure: Compilation failure:
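A quick way to confirm the pom.xml edit took effect before rebuilding is to search for the new dependency. This is a rough sketch: has_artifact is a hypothetical helper that does a plain text match, so it does not parse the XML and would also match a commented-out entry. Maven element names are case-sensitive, so it looks for artifactId exactly.

```shell
#!/bin/sh
# has_artifact POM NAME: succeed if POM contains <artifactId>NAME</artifactId>.
# Plain text match only; it does not understand XML comments or sections.
has_artifact() {
    grep -q "<artifactId>$2</artifactId>" "$1"
}

# Example (path relative to the source tree, as in the text above):
#   pom=hadoop-common-project/hadoop-auth/pom.xml
#   has_artifact "$pom" jetty-util && echo "jetty-util present" || echo "jetty-util missing"
```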
----------------------------------------------------------------------------------------------
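Recompile:
tar -zvxf hadoop-2.2.0-src.tar
cd hadoop-2.2.0-src
mvn clean package -DskipTests -Pdist,native,docs -Dtar # this takes a long time
(Note: you may run into glibc version problems; there is plenty of discussion of these online for reference.)
Unpack hadoop-2.2.0.tar.gz
Configure ~/.bashrc for the hadoop user as follows: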
# User specific environment and startup programs
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR=/usr/local/common/hadoop/conf
export HBASE_HOME=/usr/local/hbase
export HBASE_CONF_DIR=/usr/local/common/hbase/conf
export JAVA_HOME=/usr/java
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib/rt.jar
export PATH=$PATH:$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HBASE_HOME/bin:/usr/local/zookeeper/bin:/data1/script
alias jps="jps -J-Djava.io.tmpdir=$HOME"
alias jstat="jstat -J-Djava.io.tmpdir=$HOME"
source ~/.bashrc
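Because ~/.bashrc appends to PATH every time it is sourced, PATH picks up duplicate entries on each reload. One possible way to avoid that, shown only as a sketch (path_append is a made-up helper, not something Hadoop provides), is to append a directory only when it is not already present:

```shell
#!/bin/sh
# path_append DIR: add DIR to the end of PATH unless it is already listed.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

# Example use in ~/.bashrc, in place of one long export PATH=... line:
#   path_append "$HADOOP_HOME/bin"
#   path_append "$HADOOP_HOME/sbin"
#   export PATH
```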
Edit the Hadoop configuration files in the $HADOOP_CONF_DIR directory.
# Configure hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.nameservices</name>
<value>hbaseCluster</value>
</property>
<property>
<name>dfs.ha.namenodes.hbaseCluster</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.hbaseCluster.nn1</name>
<value>h112191.mars.grid.sina.com.cn:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.hbaseCluster.nn1</name>
<value>h112191.mars.grid.sina.com.cn:50070</value>
</property>
<property>
<name>dfs.namenode.rpc-address.hbaseCluster.nn2</name>
<value>h112192.mars.grid.sina.com.cn:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.hbaseCluster.nn2</name>
<value>h112192.mars.grid.sina.com.cn:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>file:///data1/hadoop/namenode_nfs</value>
<description>Shared storage where HA keeps the edits; usually an NFS mount point</description>
</property>
<!--
<property>
<name>dfs.namenode.rpc-address.ns2</name>
<value>h112192.mars.grid.sina.com.cn:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.ns2</name>
<value>h112192.mars.grid.sina.com.cn:50070</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address.ns2</name>
<value>h112192.mars.grid.sina.com.cn:50090</value>
</property>
-->
<property>
<name>dfs.replication</name>
<value>3</value>
<final>true</final>
</property>
<property>
<name>dfs.name.dir</name>
<value>file:///data1/hadoop/namenode</value>
<final>true</final>
</property>
<property>
<name>dfs.data.dir</name>
<value>/data11/hadoop/data/datanode,/data2/hadoop/data/datanode,/data3/hadoop/data/datanode,/data4/hadoop/data/datanode,/data5/hadoop/data/datanode,/data6/hadoop/data/datanode,/data7/hadoop/data/datanode,/data8/hadoop/data/datanode,/data9/hadoop/data/datanode,/data10/hadoop/data/datanode</value>
<final>true</final>
</property>
<property>
<name>fs.checkpoint.dir</name>
<value>/data1/hadoop/namesecondary</value>
<final>true</final>
</property>
<property>
<name>dfs.block.size</name>
<value>134217728</value>
<final>true</final>
</property>
<property>
<name>dfs.hosts</name>
<value>/usr/local/common/hadoop/conf/include</value>
<final>true</final>
</property>
<property>
<name>dfs.hosts.exclude</name>
<value>/usr/local/common/hadoop/conf/exclude</value>
<final>true</final>
</property>
<property>
<name>dfs.datanode.max.xcievers</name>
<value>8192</value>
</property>
<property>
<name>dfs.namenode.handler.count</name>
<value>128</value>
</property>
<property>
<name>dfs.datanode.handler.count</name>
<value>32</value>
</property>
<property>
<name>dfs.web.ugi</name>
<value>hadoop,supergroup</value>
</property>
<property>
<name>dfs.balance.bandwidthPerSec</name>
<value>52428800</value>
</property>
<!--
Configuring automatic failover
-->
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>zk4.mars.grid.sina.com.cn:2181,zk3.mars.grid.sina.com.cn:2181,zk2.mars.grid.sina.com.cn:2181,zk1.mars.grid.sina.com.cn:2181,zk5.mars.grid.sina.com.cn:2181</value>
</property>
<!-- at least one fencing method -->
<property>
<name>dfs.client.failover.proxy.provider.hbaseCluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence(hadoop:26387)</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>10000</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/usr/home/hadoop/.ssh/id_rsa</value>
</property>
</configuration>
# Configure core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<!-- Config version 1.0 -->
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hbaseCluster</value>
<description>The default file system protocol and the logical nameservice name; it must match the one in hdfs-site.
This setting replaces fs.default.name from 1.0</description>
</property>
<!--
<property>
<name>fs.defaultFS</name>
<value>hdfs://h112191.mars.grid.sina.com.cn:9000</value>
<final>true</final>
</property>
-->
<property>
<name>fs.trash.interval</name>
<value>30</value>
<final>true</final>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/tmp/hadoop-${user.name}-${hue.suffix}</value>
<final>true</final>
</property>
<property>
<name>io.compression.codecs</name>
<value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.SnappyCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
<final>true</final>
</property>
<property>
<name>io.compression.codec.lzo.class</name>
<value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
</configuration>
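Since fs.defaultFS in core-site.xml has to match dfs.nameservices in hdfs-site.xml, a small check can catch a mismatch before the cluster is started. The following is a naive sketch under stated assumptions: get_prop and check_default_fs are hypothetical helpers, the parsing is line-based (one name or value element per line, as in the files above), and it does not skip XML comments, so it simply returns the first property with a matching name.

```shell
#!/bin/sh
# get_prop FILE NAME: print the <value> of the first property called NAME.
# Line-based parsing only; XML comments are not handled.
get_prop() {
    awk -v key="$2" '
        /<name>/  { n = $0; sub(/.*<name>/, "", n); sub(/<\/name>.*/, "", n) }
        /<value>/ { if (n == key) {
                        v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
                        print v; exit
                    } }
    ' "$1"
}

# check_default_fs CONF_DIR: succeed if fs.defaultFS is hdfs://<dfs.nameservices>.
check_default_fs() {
    ns=$(get_prop "$1/hdfs-site.xml" dfs.nameservices)
    fs=$(get_prop "$1/core-site.xml" fs.defaultFS)
    if [ -n "$ns" ] && [ "$fs" = "hdfs://$ns" ]; then
        echo "OK: fs.defaultFS=$fs matches dfs.nameservices=$ns"
    else
        echo "MISMATCH: fs.defaultFS=$fs, dfs.nameservices=$ns" >&2
        return 1
    fi
}
```

With the files above, check_default_fs "$HADOOP_CONF_DIR" should report that hdfs://hbaseCluster matches hbaseCluster.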
# Configure the slaves file
hostname1
hostname2
hostname3
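First, once automatic HA is configured, all the JournalNodes must be started before anything else; run the following on each JournalNode machine. (This applies when the shared edits are kept on JournalNodes through a qjournal:// URI; with the file:// NFS directory configured above there are no JournalNodes to start.)
hadoop-daemon.sh start journalnode
Second, run hdfs namenode -format [-clusterid <clusterID>] on just one of the NameNodes. If the other NameNode then fails to start, stop the cluster and copy the NameNode directory over to it.
Third, format the HA state in ZooKeeper:
$ hdfs zkfc -formatZK
Finally, start HA; refer to the English documentation for this step:
Last, start all the processes:
$ start-dfs.sh
Failover check: after killing the active NameNode, the standby NameNode became active; the switch took 3 s.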
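The failover time can be measured roughly with hdfs haadmin -getServiceState, which prints active or standby for a given NameNode ID. The sketch below is illustrative only: wait_until_active is a made-up helper, the 60-second limit and the one-second polling interval are assumptions, and the result is therefore accurate only to about a second.

```shell
#!/bin/sh
# wait_until_active NN_ID [LIMIT_SECONDS]: poll once per second until the
# NameNode reports "active", then print how many seconds were spent waiting.
# Returns non-zero if the limit (default 60, an assumed value) is reached.
wait_until_active() {
    nn=$1
    limit=${2:-60}
    waited=0
    while [ "$waited" -lt "$limit" ]; do
        state=$(hdfs haadmin -getServiceState "$nn" 2>/dev/null)
        if [ "$state" = "active" ]; then
            echo "$waited"
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 1
}

# Example: kill the active NameNode (nn1 here), then on another machine run
#   wait_until_active nn2
```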
yarn proxyserver