Set up a highly available hadoop+yarn+hbase+storm+kafka+spark+zookeeper cluster and install the related components: JDK, MySQL, Hive, Flume
Number of virtual machines: 8
Operating system version: CentOS-7-x86_64-Minimal-1611.iso
Each virtual machine is configured as follows:
VM name | CPU cores | Memory (GB) | Disk (GB) | NICs |
---|---|---|---|---|
hadoop1 | 2 | 8 | 100 | 2 |
hadoop2 | 2 | 8 | 100 | 2 |
hadoop3 | 2 | 8 | 100 | 2 |
hadoop4 | 2 | 8 | 100 | 2 |
hadoop5 | 2 | 8 | 100 | 2 |
hadoop6 | 2 | 8 | 100 | 2 |
hadoop7 | 2 | 8 | 100 | 2 |
hadoop8 | 2 | 8 | 100 | 2 |
8-node Hadoop+Yarn+Spark+Hbase+Kafka+Storm+ZooKeeper HA cluster layout:
Cluster | VM nodes |
---|---|
Hadoop HA cluster | hadoop1,hadoop2,hadoop3,hadoop4,hadoop5,hadoop6,hadoop7,hadoop8 |
Yarn HA cluster | hadoop1,hadoop2,hadoop3,hadoop4,hadoop5,hadoop6,hadoop7,hadoop8 |
ZooKeeper cluster | hadoop3,hadoop4,hadoop5 |
HBase cluster | hadoop3,hadoop4,hadoop5,hadoop6,hadoop7 |
Kafka cluster | hadoop6,hadoop7,hadoop8 |
Storm cluster | hadoop3,hadoop4,hadoop5,hadoop6,hadoop7 |
Spark HA cluster | hadoop1,hadoop2,hadoop3,hadoop4,hadoop5,hadoop6,hadoop7,hadoop8 |
Detailed cluster plan:
VM name | IP | Installed software | Processes | Role |
---|---|---|---|---|
hadoop1 | 59.68.29.79 | jdk,hadoop,mysql | NameNode,ResourceManager,DFSZKFailoverController(zkfc),master(spark) | NameNode of hadoop, master of spark, ResourceManager of yarn |
hadoop2 | 10.230.203.11 | jdk,hadoop,spark | NameNode,ResourceManager,DFSZKFailoverController(zkfc),worker(spark) | standby (failover) node for hadoop/yarn and for spark |
hadoop3 | 10.230.203.12 | jdk,hadoop,zookeeper,hbase,storm,spark | DataNode,NodeManager,journalnode,QuorumPeerMain(zk),HMaster,…(storm),worker(spark) | master node of storm, hbase and zookeeper |
hadoop4 | 10.230.203.13 | jdk,hadoop,zookeeper,hbase,storm,spark | DataNode,NodeManager,journalnode,QuorumPeerMain(zk),HRegionServer,…(storm),worker(spark) | |
hadoop5 | 10.230.203.14 | jdk,hadoop,zookeeper,hbase,storm,spark | DataNode,NodeManager,journalnode,QuorumPeerMain(zk),HRegionServer,…(storm),worker(spark) | |
hadoop6 | 10.230.203.15 | jdk,hadoop,hbase,storm,kafka,spark | DataNode,NodeManager,journalnode,kafka,HRegionServer,…(storm),worker(spark) | kafka master node |
hadoop7 | 10.230.203.16 | jdk,hadoop,hbase,storm,kafka,spark | DataNode,NodeManager,journalnode,kafka,HRegionServer,…(storm),worker(spark) | |
hadoop8 | 10.230.203.17 | jdk,hadoop,kafka,spark | DataNode,NodeManager,journalnode,kafka,worker(spark) | |
JDK version: jdk-8u65-linux-x64.tar.gz
hadoop version: hadoop-2.7.6.tar.gz
zookeeper version: zookeeper-3.4.12.tar.gz
hbase version: hbase-1.2.6-bin.tar.gz
Storm version: apache-storm-1.1.3.tar.gz
kafka version: kafka_2.11-2.0.0.tgz
MySQL version: mysql-5.6.41-linux-glibc2.12-x86_64.tar.gz
hive version: apache-hive-2.3.3-bin.tar.gz
Flume version: apache-flume-1.8.0-bin.tar.gz
Spark version: spark-2.3.1-bin-hadoop2.7.tgz
Apply the same settings on every host node.
Important: do not configure the cluster while running as root.
$> groupadd centos
$> useradd centos -g centos
$> passwd centos
$> nano /etc/sudoers    add the following lines:
## Allow root to run any commands anywhere
root    ALL=(ALL)   ALL
centos  ALL=(ALL)   ALL
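After saving, the sudoers file can be checked for syntax errors without changing anything (optional):
$> sudo visudo -c    // reports "parsed OK" if the edit is valid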
$> sudo nano /etc/hostname    set the hostname: hadoop1, hadoop2, ...
$> sudo nano /etc/hosts    add the following entries:
127.0.0.1 localhost
59.68.29.79 hadoop1
10.230.203.11 hadoop2
10.230.203.12 hadoop3
10.230.203.13 hadoop4
10.230.203.14 hadoop5
10.230.203.15 hadoop6
10.230.203.16 hadoop7
10.230.203.17 hadoop8
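An optional sanity check that every hostname in /etc/hosts resolves and is reachable; a minimal sketch run from any node:
# ping each node once; prints OK or UNREACHABLE per host
for i in 1 2 3 4 5 6 7 8 ; do
    ping -c 1 hadoop$i > /dev/null && echo "hadoop$i OK" || echo "hadoop$i UNREACHABLE"
done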
Show the full working directory in the prompt: a path like ~ is displayed as /home/centos, which makes it easy to tell where the current file lives.
[centos@hadoop1 ~]$ sudo nano /etc/profile
append at the end:
export PS1='[\u@\h `pwd`]\$'
// source /etc/profile to apply immediately
[centos@hadoop1 /home/centos]$
hadoop1 and hadoop2 are the failover nodes (they eliminate the single point of failure), so besides reaching each other they must also be able to log in to every other node, and that login has to be passwordless.
[centos@hadoop1 /home/centos]$ yum list installed | grep ssh
[centos@hadoop1 /home/centos]$ ps -Af | grep sshd
[centos@hadoop1 /home/centos]$ mkdir .ssh
[centos@hadoop1 /home/centos]$ chmod 700 ~/.ssh
// generate the key pair
[centos@hadoop1 /home/centos]$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
// enter the ~/.ssh directory
[centos@hadoop1 /home/centos]$ cd ~/.ssh
// append the public key to ~/.ssh/authorized_keys
[centos@hadoop1 /home/centos/.ssh]$ cat id_rsa.pub >> authorized_keys
// change the permissions of authorized_keys to 644
[centos@hadoop1 /home/centos/.ssh]$ chmod 644 authorized_keys
// rename the public key (on hadoop1)
[centos@hadoop1 /home/centos/.ssh]$ mv id_rsa.pub id_rsa_hadoop1.pub
// copy it to the other nodes as their authorized_keys
[centos@hadoop1 /home/centos/.ssh]$ scp id_rsa_hadoop1.pub centos@hadoop2:/home/centos/.ssh/authorized_keys
[centos@hadoop1 /home/centos/.ssh]$ scp id_rsa_hadoop1.pub centos@hadoop3:/home/centos/.ssh/authorized_keys
[centos@hadoop1 /home/centos/.ssh]$ scp id_rsa_hadoop1.pub centos@hadoop4:/home/centos/.ssh/authorized_keys
[centos@hadoop1 /home/centos/.ssh]$ scp id_rsa_hadoop1.pub centos@hadoop5:/home/centos/.ssh/authorized_keys
[centos@hadoop1 /home/centos/.ssh]$ scp id_rsa_hadoop1.pub centos@hadoop6:/home/centos/.ssh/authorized_keys
[centos@hadoop1 /home/centos/.ssh]$ scp id_rsa_hadoop1.pub centos@hadoop7:/home/centos/.ssh/authorized_keys
[centos@hadoop1 /home/centos/.ssh]$ scp id_rsa_hadoop1.pub centos@hadoop8:/home/centos/.ssh/authorized_keys
// generate the key pair on hadoop2
[centos@hadoop2 /home/centos]$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
// rename the public key
[centos@hadoop2 /home/centos/.ssh]$ mv id_rsa.pub id_rsa_hadoop2.pub
// copy hadoop2's public key over to hadoop1 so it can be merged there
[centos@hadoop2 /home/centos/.ssh]$ scp id_rsa_hadoop2.pub centos@hadoop1:/home/centos/.ssh/
// append it to ~/.ssh/authorized_keys on hadoop1
[centos@hadoop1 /home/centos/.ssh]$ cat id_rsa_hadoop2.pub >> authorized_keys
// distribute the merged authorized_keys to the other nodes
[centos@hadoop1 /home/centos/.ssh]$ scp authorized_keys centos@hadoop3:/home/centos/.ssh/
... repeat for the remaining nodes
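To confirm that passwordless login actually works, a trivial command can be run on every node from hadoop1 (and again from hadoop2); with BatchMode enabled, ssh fails instead of prompting if a key is missing. A minimal check:
# each line should print the remote hostname without any password prompt
for i in 1 2 3 4 5 6 7 8 ; do
    ssh -o BatchMode=yes hadoop$i hostname
done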
To make sure the cluster can start normally, first turn off the firewall on every host. The relevant commands:
[CentOS versions before 6.5]
$> sudo service iptables stop      // stop the service
$> sudo service iptables start     // start the service
$> sudo service iptables status    // check status
[CentOS 7]
$> sudo systemctl enable firewalld.service    // enable start on boot
$> sudo systemctl disable firewalld.service   // disable start on boot
$> sudo systemctl start firewalld.service     // start the firewall
$> sudo systemctl stop firewalld.service      // stop the firewall
$> sudo systemctl status firewalld.service    // check firewall status
[start on boot, legacy chkconfig]
$> sudo chkconfig firewalld on     // enable start on boot
$> sudo chkconfig firewalld off    // disable start on boot
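Once passwordless SSH is set up, the firewall state of all nodes can be checked from hadoop1 in one pass; `systemctl is-active` needs no root privileges. A minimal sketch:
# prints "inactive" (or "unknown") for every node whose firewalld has been stopped
for i in 1 2 3 4 5 6 7 8 ; do
    echo -n "hadoop$i: "
    ssh hadoop$i systemctl is-active firewalld
done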
Tip: to make them available everywhere, the scripts go in /usr/local/bin. They only need to be set up on hadoop1 and hadoop2.
// create xcall.sh as the local user (centos)
$> touch ~/xcall.sh
// move it to /usr/local/bin
$> sudo mv ~/xcall.sh /usr/local/bin
// make it executable
$> sudo chmod a+x /usr/local/bin/xcall.sh
// edit the script
$> sudo nano /usr/local/bin/xcall.sh
#!/bin/bash
# xcall.sh: run the same command on all 8 nodes over ssh
params=$@
for (( i = 1 ; i <= 8 ; i = $i + 1 )) ; do
    echo ============= hadoop$i $params =============
    ssh hadoop$i "$params"
done
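Typical usage, once the script is in /usr/local/bin: pass the command to run on every node as the arguments, for example listing the Java processes cluster-wide (assumes a JDK is already installed on each node):
[centos@hadoop1 /home/centos]$ xcall.sh jps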
#!/bin/bash
# distribute a file or directory to the same absolute path on all 8 nodes using rsync
if [[ $# -lt 1 ]] ; then
    echo no params
    exit
fi
p=$1
#echo p=$p
# resolve the directory and file name of the argument
dir=`dirname $p`
#echo dir=$dir
filename=`basename $p`
#echo filename=$filename
cd $dir
fullpath=`pwd -P .`
#echo fullpath=$fullpath
user=`whoami`
for (( i = 1 ; i <= 8 ; i = $i + 1 )) ; do
    echo ======= hadoop$i =======
    rsync -lr $p ${user}@hadoop$i:$fullpath
done
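This second script is installed the same way as xcall.sh (touch, move to /usr/local/bin, chmod a+x); the name xsync.sh used below is only an assumption, since this section does not state it. It pushes a file or directory to every node, for example:
// hypothetical invocation: distribute the ha configuration directory to all nodes
[centos@hadoop1 /home/centos]$ xsync.sh /soft/hadoop/etc/ha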
Prepare the JDK: jdk-8u65-linux-x64.tar.gz. Upload it to /home/centos/localsoft on hadoop1; this directory holds every installation package that will be needed.
Create a soft directory under the root directory (/) and change its group and owner to centos; all software will be installed under this directory.
// create the soft directory
[centos@hadoop1 /home/centos]$ sudo mkdir /soft
// change ownership (substitute your own local user name if it differs)
[centos@hadoop1 /home/centos]$ sudo chown centos:centos /soft
// extract from /home/centos/localsoft into /soft
[centos@hadoop1 /home/centos/localsoft]$ tar -xzvf jdk-8u65-linux-x64.tar.gz -C /soft
// create a symbolic link
[centos@hadoop1 /soft]$ ln -s /soft/jdk1.8.0_65 jdk
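An optional quick check that the symbolic link points at the extracted JDK:
[centos@hadoop1 /soft]$ ls -l /soft/jdk
[centos@hadoop1 /soft]$ readlink /soft/jdk    // should print /soft/jdk1.8.0_65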
// open profile
[centos@hadoop1 /home/centos]$ sudo nano /etc/profile
// environment variables
# jdk
export JAVA_HOME=/soft/jdk
export PATH=$PATH:$JAVA_HOME/bin
// source it so it takes effect immediately
[centos@hadoop1 /home/centos]$ source /etc/profile
[centos@hadoop1 /home/centos]$ java -version
// output:
java version "1.8.0_65"
Java(TM) SE Runtime Environment (build 1.8.0_65-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.65-b01, mixed mode)
// extract from /home/centos/localsoft into /soft
[centos@hadoop1 /home/centos/localsoft]$ tar -xzvf hadoop-2.7.6.tar.gz -C /soft
// create a symbolic link
[centos@hadoop1 /soft]$ ln -s /soft/hadoop-2.7.6 hadoop
// open profile
[centos@hadoop1 /home/centos]$ sudo nano /etc/profile
// environment variables
# hadoop
export HADOOP_HOME=/soft/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
// source it so it takes effect immediately
[centos@hadoop1 /home/centos]$ source /etc/profile
// check that the installation succeeded
[centos@hadoop1 /home/centos]$ hadoop version
output:
Hadoop 2.7.6
Subversion https://shv@git-wip-us.apache.org/repos/asf/hadoop.git -r 085099c66cf28be31604560c376fa282e69282b8
Compiled by kshvachk on 2018-04-18T01:33Z
Compiled with protoc 2.5.0
From source with checksum 71e2695531cb3360ab74598755d036
This command was run using /soft/hadoop-2.7.6/share/hadoop/common/hadoop-common-2.7.6.jar
Note: everything so far is done on hadoop1 only. There is no need to install or configure the other nodes yet; once the configuration is complete it will be copied to them in one go, which saves a great deal of work.
This sets up Hadoop's native NameNode HA; it will be integrated with the zookeeper cluster later to get automatic failover (Yarn + NameNode).
[centos@hadoop1 /soft/hadoop/etc]$ cp -r hadoop ha
[centos@hadoop1 /soft/hadoop/etc]$ cp -r hadoop full
[centos@hadoop1 /soft/hadoop/etc]$ cp -r hadoop pesudo
// remove the original hadoop directory (its contents now live in ha, full and pesudo), then point a symbolic link at the ha configuration
[centos@hadoop1 /soft/hadoop/etc]$ rm -rf hadoop
[centos@hadoop1 /soft/hadoop/etc]$ ln -s /soft/hadoop/etc/ha hadoop
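The point of keeping ha, full and pesudo side by side is that the active configuration can be switched just by re-pointing the symlink; a minimal sketch, assuming the layout created above:
// switch the active configuration to full, then back to ha (-n replaces the link itself rather than descending into it)
[centos@hadoop1 /soft/hadoop/etc]$ ln -sfn /soft/hadoop/etc/full hadoop
[centos@hadoop1 /soft/hadoop/etc]$ ln -sfn /soft/hadoop/etc/ha hadoop
// confirm which configuration is currently active
[centos@hadoop1 /soft/hadoop/etc]$ readlink hadoop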
[core-site.xml]
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://mycluster</value>
    </property>
    <!-- new local data directory -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/centos/hadoop</value>
    </property>
    <property>
        <name>ipc.client.connect.max.retries</name>
        <value>20</value>
    </property>
    <property>
        <name>ipc.client.connect.retry.interval</name>
        <value>5000</value>
    </property>
</configuration>
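After this configuration is distributed later on, the local directory named in hadoop.tmp.dir can be created up front on every node; with the xcall.sh script from earlier that is a single command (a sketch, assuming passwordless SSH to all nodes is in place):
// create /home/centos/hadoop on all 8 nodes
[centos@hadoop1 /home/centos]$ xcall.sh mkdir -p /home/centos/hadoop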
[hdfs-site.xml]
<configuration>
    <!-- nameservice -->
    <property>
        <name>dfs.nameservices</name>
        <value>mycluster</value>
    </property>
    <!-- the ids of the two namenodes under mycluster -->
    <property>
        <name>dfs.ha.namenodes.mycluster</name>
        <value>nn1,nn2</value>
    </property>
    <!-- rpc address of each namenode -->
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn1</name>
        <value>hadoop1:8020</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn2</name>
        <value>hadoop2:8020</value>
    </property>
    <!-- web UI ports -->
    <property>
        <name>dfs.namenode.http-address.mycluster.nn1</name>
        <value>hadoop1:50070</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.mycluster.nn2</name>
        <value>hadoop2:50070</value>
    </property>
    <!-- shared edits directory of the namenodes -->
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://hadoop3:8485;hadoop4:8485;hadoop5:8485;hadoop6:8485;hadoop7:8485;hadoop8:8485/mycluster</value>
    </property>
    <!-- java class that clients use to determine which namenode is active -->
    <property>
        <name>dfs.client.failover.proxy.provider.mycluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <!