Rough installation steps:
Host IP | Hostname | Role
172.21.25.100 | namenode.yxnrtf.openpf | NameNode
172.21.25.104 | datanode01.yxnrtf.openpf | DataNode
172.21.25.105 | datanode02.yxnrtf.openpf | DataNode
1. JDK Installation
mkdir -p /usr/local/java
tar -zxvf jdk-7u80-linux-x64.gz -C /usr/local/java
Configure the environment variables:
# java
export JAVA_HOME=/usr/local/java/jdk1.7.0_80
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
Verify: java -version
2. Passwordless SSH Login
Run the following command on every node; it generates the .ssh directory under /root/:
ssh-keygen -t rsa -P ''
On the NameNode, append id_rsa.pub to the authorized keys:
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Copy each DataNode's id_rsa.pub to the master node and append it to the master's authorized keys:
scp ~/.ssh/id_rsa.pub root@172.21.25.100:~/    # run on each DataNode
cat ~/id_rsa.pub >> ~/.ssh/authorized_keys     # then run on the NameNode to append the key
Copy the NameNode's authorized_keys back to the DataNodes:
scp ~/.ssh/authorized_keys root@172.21.25.104:~/.ssh/
scp ~/.ssh/authorized_keys root@172.21.25.105:~/.ssh/
chmod 600 ~/.ssh/authorized_keys   # run on each DataNode to fix the file permissions
On every node, edit /etc/ssh/sshd_config:
RSAAuthentication yes                     # enable RSA authentication
PubkeyAuthentication yes                  # enable public/private key pair authentication
AuthorizedKeysFile .ssh/authorized_keys   # public key file path (the file generated above)
service sshd restart                      # restart the ssh service
Verify with ssh localhost and ssh <DataNode IP>; the nodes should now be able to reach each other without a password.
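The key-distribution steps above can be sketched as one script. This is a minimal sketch assuming root SSH access and the IPs from the host table; with DRY_RUN=1 it only prints each command so the flow can be reviewed before touching any node.

```shell
#!/bin/sh
# Sketch of the key distribution above. DRY_RUN=1 prints the commands
# instead of executing them.
DRY_RUN=1
NAMENODE=172.21.25.100
DATANODES="172.21.25.104 172.21.25.105"

run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

# Pull every DataNode's public key over to the NameNode (run on the NameNode)...
for dn in $DATANODES; do
  run scp "root@$dn:.ssh/id_rsa.pub" "/tmp/id_rsa.$dn.pub"
  run sh -c "cat /tmp/id_rsa.$dn.pub >> /root/.ssh/authorized_keys"
done

# ...then push the merged authorized_keys back out to each DataNode.
for dn in $DATANODES; do
  run scp /root/.ssh/authorized_keys "root@$dn:.ssh/"
done
```

Flipping DRY_RUN to 0 executes the same commands for real.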
3. Time Synchronization with ntpdate
Time is synchronized by a crontab job that runs ntpdate every 5 minutes; the NTP daemon is not used because it will not sync once the clock offset exceeds a certain threshold.
crontab -e   # edit the crond table and add:
*/5 * * * * /usr/sbin/ntpdate ntp.oss.XX && hwclock --systohc   # ntp.oss.XX is your NTP server
4. Network Configuration
Disable the firewall on all machines and add entries to /etc/hosts:
service iptables status
service iptables stop
Add to /etc/hosts:
172.21.25.100 namenode.yxnrtf.openpf
172.21.25.104 datanode01.yxnrtf.openpf
172.21.25.105 datanode02.yxnrtf.openpf
5. Installing CDH5
Run on the NameNode:
wget http://archive.cloudera.com/cdh5/one-click-install/redhat/6/x86_64/cloudera-cdh-5-0.x86_64.rpm
Disable the GPG signature check and install the local package:
yum --nogpgcheck localinstall cloudera-cdh-5-0.x86_64.rpm
Import the Cloudera repository GPG key:
rpm --import http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera
On the NameNode, install the namenode, resourcemanager, nodemanager, datanode, mapreduce, historyserver, proxyserver, and hadoop-client packages:
yum install hadoop hadoop-hdfs hadoop-client hadoop-doc hadoop-debuginfo hadoop-hdfs-namenode hadoop-yarn-resourcemanager hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce hadoop-mapreduce-historyserver hadoop-yarn-proxyserver -y
Install on the DataNodes:
yum install hadoop hadoop-hdfs hadoop-client hadoop-doc hadoop-debuginfo hadoop-yarn hadoop-hdfs-datanode hadoop-yarn-nodemanager hadoop-mapreduce -y
Install the SecondaryNameNode. Here it is installed on the NameNode host; if you install it on a different server, some of the later configuration must be adjusted accordingly.
yum install hadoop-hdfs-secondarynamenode -y
Add the following to /etc/hadoop/conf/hdfs-site.xml:
<property>
  <name>dfs.namenode.checkpoint.check.period</name>
  <value>60</value>
</property>
<property>
  <name>dfs.namenode.checkpoint.txns</name>
  <value>1000000</value>
</property>
<property>
  <name>dfs.namenode.checkpoint.dir</name>
  <value>file:///data/cache1/dfs/namesecondary</value>
</property>
<property>
  <name>dfs.namenode.checkpoint.edits.dir</name>
  <value>file:///data/cache1/dfs/namesecondary</value>
</property>
<property>
  <name>dfs.namenode.num.checkpoints.retained</name>
  <value>2</value>
</property>
<!-- make namenode.yxnrtf.openpf the SecondaryNameNode -->
<property>
  <name>dfs.secondary.http.address</name>
  <value>namenode.yxnrtf.openpf:50090</value>
</property>
Create the directory on the NameNode:
mkdir -p /data/cache1/dfs/nn
chown -R hdfs:hadoop /data/cache1/dfs/nn
chmod -R 700 /data/cache1/dfs/nn
Create the directories on the DataNodes:
mkdir -p /data/cache1/dfs/dn
mkdir -p /data/cache1/dfs/mapred/local
chown -R hdfs:hadoop /data/cache1/dfs/dn
chmod -R 777 /data/
usermod -a -G mapred hadoop
chown -R mapred:hadoop /data/cache1/dfs/mapred/local
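The directory setup lends itself to a small helper that takes the root path as a parameter, so the layout can be rehearsed in a scratch directory before running it for real. A sketch (the chown/usermod calls are left to the commands above, since they require root and the hdfs/mapred users):

```shell
#!/bin/sh
# Create the DataNode directory tree under a configurable root.
# Ownership changes (chown/usermod) are omitted here: they need root.
setup_dn_dirs() {
  root=$1
  mkdir -p "$root/data/cache1/dfs/dn" \
           "$root/data/cache1/dfs/mapred/local"
  chmod -R 777 "$root/data"
}

# Rehearse against a scratch directory first:
setup_dn_dirs /tmp/dn-rehearsal
ls -d /tmp/dn-rehearsal/data/cache1/dfs/dn
```

On the real nodes the root would simply be `/`, followed by the chown/usermod commands above.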
On every node, add the following to /etc/profile:
export HADOOP_HOME=/usr/lib/hadoop
export HIVE_HOME=/usr/lib/hive
export HBASE_HOME=/usr/lib/hbase
export HADOOP_HDFS_HOME=/usr/lib/hadoop-hdfs
export HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_LIBEXEC_DIR=$HADOOP_HOME/libexec
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HDFS_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_YARN_HOME=/usr/lib/hadoop-yarn
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin:$HBASE_HOME/bin:$PATH
Run source /etc/profile to apply the changes.
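After sourcing the profile, it is worth confirming that the variables the Hadoop launcher scripts rely on are really set. A minimal check (the variable names are the ones from the block above):

```shell
#!/bin/sh
# Print each required variable, flagging any that are unset or empty.
check_env() {
  for v in "$@"; do
    eval "val=\$$v"
    if [ -n "$val" ]; then
      echo "$v=$val"
    else
      echo "MISSING $v"
    fi
  done
}

check_env HADOOP_HOME HADOOP_CONF_DIR HADOOP_HDFS_HOME HADOOP_YARN_HOME
```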
On the NameNode, add the following to /etc/hadoop/conf/core-site.xml:
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://namenode.yxnrtf.openpf:9000</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
<property>
  <name>hadoop.proxyuser.hadoop.hosts</name>
  <value>namenode.yxnrtf.openpf</value>
</property>
<property>
  <name>hadoop.proxyuser.hadoop.groups</name>
  <value>hdfs</value>
</property>
<property>
  <name>hadoop.proxyuser.mapred.groups</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.mapred.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.yarn.groups</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.yarn.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.httpfs.hosts</name>
  <value>httpfs-host.foo.com</value>
</property>
<property>
  <name>hadoop.proxyuser.httpfs.groups</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hive.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hive.groups</name>
  <value>*</value>
</property>
Add the following to /etc/hadoop/conf/hdfs-site.xml:
<property>
  <name>dfs.namenode.name.dir</name>
  <value>/data/cache1/dfs/nn/</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/data/cache1/dfs/dn/</value>
</property>
<property>
  <name>dfs.hosts</name>
  <value>/etc/hadoop/conf/slaves</value>
</property>
<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>
<property>
  <name>dfs.permissions.superusergroup</name>
  <value>hdfs</value>
</property>
Add the following to /etc/hadoop/conf/mapred-site.xml:
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>namenode.yxnrtf.openpf:10020</value>
</property>
<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>namenode.yxnrtf.openpf:19888</value>
</property>
<property>
  <name>mapreduce.jobhistory.joblist.cache.size</name>
  <value>50000</value>
</property>
<!-- the directories created on HDFS earlier -->
<property>
  <name>mapreduce.jobhistory.done-dir</name>
  <value>/user/hadoop/done</value>
</property>
<property>
  <name>mapreduce.jobhistory.intermediate-done-dir</name>
  <value>/user/hadoop/tmp</value>
</property>
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
Add the following to /etc/hadoop/conf/yarn-site.xml:
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
<property>
  <description>List of directories to store localized files in.</description>
  <name>yarn.nodemanager.local-dirs</name>
  <value>/var/lib/hadoop-yarn/cache/${user.name}/nm-local-dir</value>
</property>
<property>
  <description>Where to store container logs.</description>
  <name>yarn.nodemanager.log-dirs</name>
  <value>/var/log/hadoop-yarn/containers</value>
</property>
<property>
  <description>Where to aggregate logs to.</description>
  <name>yarn.nodemanager.remote-app-log-dir</name>
  <value>hdfs://namenode.yxnrtf.openpf:9000/var/log/hadoop-yarn/apps</value>
</property>
<property>
  <name>yarn.resourcemanager.address</name>
  <value>namenode.yxnrtf.openpf:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>namenode.yxnrtf.openpf:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address</name>
  <value>namenode.yxnrtf.openpf:8088</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>namenode.yxnrtf.openpf:8031</value>
</property>
<property>
  <name>yarn.resourcemanager.admin.address</name>
  <value>namenode.yxnrtf.openpf:8033</value>
</property>
<property>
  <description>Classpath for typical applications.</description>
  <name>yarn.application.classpath</name>
  <value>
    $HADOOP_CONF_DIR,
    $HADOOP_COMMON_HOME/*,
    $HADOOP_COMMON_HOME/lib/*,
    $HADOOP_HDFS_HOME/*,
    $HADOOP_HDFS_HOME/lib/*,
    $HADOOP_MAPRED_HOME/*,
    $HADOOP_MAPRED_HOME/lib/*,
    $HADOOP_YARN_HOME/*,
    $HADOOP_YARN_HOME/lib/*
  </value>
</property>
<property>
  <name>yarn.web-proxy.address</name>
  <value>namenode.yxnrtf.openpf:54315</value>
</property>
Edit /etc/hadoop/conf/slaves:
datanode01.yxnrtf.openpf
datanode02.yxnrtf.openpf
In yarn-env.sh, below the commented-out JAVA_HOME line, add:
export JAVA_HOME=/usr/local/java/jdk1.7.0_80
Copy the /etc/hadoop/conf directory to the DataNodes:
scp -r conf/ root@172.21.25.104:/etc/hadoop/
scp -r conf/ root@172.21.25.105:/etc/hadoop/
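After the copy, a quick way to confirm that every node ended up with an identical conf directory is to reduce the whole directory to a single digest and compare the printed values across hosts. A sketch assuming md5sum is available:

```shell
#!/bin/sh
# One digest per directory: equal digests mean identical file set and contents.
conf_digest() {
  (cd "$1" && find . -type f | sort | xargs md5sum) | md5sum | cut -d' ' -f1
}

# Run on every node and compare the printed hashes:
conf_digest /etc/hadoop/conf
```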
Start the services on the NameNode:
hdfs namenode -format
service hadoop-hdfs-namenode init
service hadoop-hdfs-namenode start
service hadoop-yarn-resourcemanager start
service hadoop-yarn-proxyserver start
service hadoop-mapreduce-historyserver start
Start the services on the DataNodes:
service hadoop-hdfs-datanode start
service hadoop-yarn-nodemanager start
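A minimal liveness test after startup is to probe each daemon's TCP port from a shell. This is a sketch using bash's /dev/tcp; the hostname and ports are the ones configured earlier, so adjust them if yours differ:

```shell
#!/bin/bash
# Print "<name> OK" if something is listening on host:port, else "<name> DOWN".
check_port() {
  host=$1; port=$2; name=$3
  if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
    echo "$name OK"
  else
    echo "$name DOWN"
  fi
}

check_port namenode.yxnrtf.openpf 9000  "NameNode RPC"
check_port namenode.yxnrtf.openpf 8032  "ResourceManager"
check_port namenode.yxnrtf.openpf 10020 "JobHistoryServer"
```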
View in the browser:
Service | URL
HDFS |
ResourceManager (YARN) |
Online nodes | http://172.21.25.100:8088/cluster/nodes
NodeManager |
JobHistory | http://172.21.25.100:19888/
6. ZooKeeper Installation
Run the following command on every node:
yum install zookeeper* -y
On the NameNode, edit the configuration file /etc/zookeeper/conf/zoo.cfg:
# clean logs
autopurge.snapRetainCount=3
autopurge.purgeInterval=1
server.1=namenode.yxnrtf.openpf:2888:3888
server.2=datanode01.yxnrtf.openpf:2888:3888
server.3=datanode02.yxnrtf.openpf:2888:3888
Start on the NameNode:
service zookeeper-server init --myid=1
service zookeeper-server start
Start on DataNode 1:
service zookeeper-server init --myid=2
service zookeeper-server start
Start on DataNode 2:
service zookeeper-server init --myid=3
service zookeeper-server start
Note that the --myid value passed on each node must match that host's server.N number in the configuration file.
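Since a mismatched --myid is an easy mistake to make, one hedged way to avoid it is to derive the id from zoo.cfg itself. A sketch of a helper, assuming the server.N entries use each node's hostname as in the config above:

```shell
#!/bin/sh
# Look up this host's server.N number in zoo.cfg so --myid cannot drift
# from the configuration.
myid_for() {  # usage: myid_for <hostname> <path-to-zoo.cfg>
  sed -n "s/^server\.\([0-9][0-9]*\)=$1:.*/\1/p" "$2" | head -n 1
}

# e.g. on each node:
#   service zookeeper-server init --myid="$(myid_for "$(hostname)" /etc/zookeeper/conf/zoo.cfg)"
```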
To verify that everything started, run this command on the NameNode:
zookeeper-client -server namenode.yxnrtf.openpf:2181