Machine layout
hadoop1 192.168.56.121
hadoop2 192.168.56.122
hadoop3 192.168.56.123
Prepare the installation packages
jdk-7u71-linux-x64.tar.gz
zookeeper-3.4.9.tar.gz
hadoop-2.9.2.tar.gz
Upload the packages to the /usr/local directory on all three machines and extract them.
Configure hosts
Run the following on all three machines:
echo "192.168.56.121 hadoop1" >> /etc/hosts
echo "192.168.56.122 hadoop2" >> /etc/hosts
echo "192.168.56.123 hadoop3" >> /etc/hosts
Configure environment variables
/etc/profile
export HADOOP_PREFIX=/usr/local/hadoop-2.9.2
export JAVA_HOME=/usr/local/jdk1.7.0_71
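After editing, reload the profile on each machine and confirm both variables are picked up (a quick sanity check):
source /etc/profile
echo $HADOOP_PREFIX
$JAVA_HOME/bin/java -version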
Deploy ZooKeeper
Create the zoo user
useradd zoo
passwd zoo
Change the owner of the ZooKeeper directory to zoo
chown zoo:zoo -R /usr/local/zookeeper-3.4.9
Edit the ZooKeeper configuration file
In the /usr/local/zookeeper-3.4.9/conf directory:
cp zoo_sample.cfg zoo.cfg
vi zoo.cfg
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/usr/local/zookeeper-3.4.9
clientPort=2181
server.1=hadoop1:2888:3888
server.2=hadoop2:2888:3888
server.3=hadoop3:2888:3888
Create a myid file in the /usr/local/zookeeper-3.4.9 directory. The file holds only a number between 1 and 255, and it must match the id in the corresponding server.id line of zoo.cfg.
myid on hadoop1 is 1
myid on hadoop2 is 2
myid on hadoop3 is 3
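A minimal way to create the three myid files, matching the dataDir configured above (run the appropriate line on its own host):
# on hadoop1
echo 1 > /usr/local/zookeeper-3.4.9/myid
# on hadoop2
echo 2 > /usr/local/zookeeper-3.4.9/myid
# on hadoop3
echo 3 > /usr/local/zookeeper-3.4.9/myid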
Start the ZooKeeper service on all three machines
[zoo@hadoop1 zookeeper-3.4.9]$ bin/zkServer.sh start
Verify ZooKeeper
[zoo@hadoop1 zookeeper-3.4.9]$ bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper-3.4.9/bin/../conf/zoo.cfg
Mode: follower
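Another quick liveness check, assuming nc (netcat) is installed, is ZooKeeper's four-letter ruok command, which should answer imok:
echo ruok | nc hadoop1 2181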
Configure Hadoop
Create the hadoop user
useradd hadoop
passwd hadoop
Change the owner of the Hadoop directory to hadoop
chown hadoop:hadoop -R /usr/local/hadoop-2.9.2
Create the data directories
mkdir /hadoop1 /hadoop2 /hadoop3
chown hadoop:hadoop /hadoop1
chown hadoop:hadoop /hadoop2
chown hadoop:hadoop /hadoop3
Set up SSH trust between the nodes
ssh-keygen
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop1
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop2
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop3
# Test the trust with the following commands
ssh hadoop1 date
ssh hadoop2 date
ssh hadoop3 date
Configure environment variables
/home/hadoop/.bash_profile
export PATH=$JAVA_HOME/bin:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin:$PATH
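After adding the line, reload the profile and check that the Hadoop commands resolve (a quick sanity check):
source /home/hadoop/.bash_profile
hadoop version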
Configure parameters
etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/local/jdk1.7.0_71
etc/hadoop/core-site.xml
<!-- Set the HDFS nameservice to ns -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://ns</value>
</property>
<!-- Temporary directory for Hadoop data -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/usr/local/hadoop-2.9.2/temp</value>
</property>
<property>
  <name>io.file.buffer.size</name>
  <value>4096</value>
</property>
<!-- ZooKeeper quorum addresses -->
<property>
  <name>ha.zookeeper.quorum</name>
  <value>hadoop1:2181,hadoop2:2181,hadoop3:2181</value>
</property>
etc/hadoop/hdfs-site.xml
<!-- The HDFS nameservice is ns; it must match core-site.xml -->
<property>
  <name>dfs.nameservices</name>
  <value>ns</value>
</property>
<!-- The ns nameservice has two NameNodes, nn1 and nn2 -->
<property>
  <name>dfs.ha.namenodes.ns</name>
  <value>nn1,nn2</value>
</property>
<!-- RPC address of nn1 -->
<property>
  <name>dfs.namenode.rpc-address.ns.nn1</name>
  <value>hadoop1:9000</value>
</property>
<!-- HTTP address of nn1 -->
<property>
  <name>dfs.namenode.http-address.ns.nn1</name>
  <value>hadoop1:50070</value>
</property>
<!-- RPC address of nn2 -->
<property>
  <name>dfs.namenode.rpc-address.ns.nn2</name>
  <value>hadoop2:9000</value>
</property>
<!-- HTTP address of nn2 -->
<property>
  <name>dfs.namenode.http-address.ns.nn2</name>
  <value>hadoop2:50070</value>
</property>
<!-- Where the NameNode metadata is stored on the JournalNodes -->
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://hadoop1:8485;hadoop2:8485;hadoop3:8485/ns</value>
</property>
<!-- Where each JournalNode stores its data on local disk -->
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/hadoop1/hdfs/journal</value>
</property>
<!-- Enable automatic failover when a NameNode fails -->
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
<!-- Failover proxy provider used by clients -->
<property>
  <name>dfs.client.failover.proxy.provider.ns</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<!-- Fencing method; if SSH uses the default port 22, sshfence alone is enough -->
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>
<!-- The sshfence method requires passwordless SSH -->
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/home/hadoop/.ssh/id_rsa</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/hadoop1/hdfs/name,file:/hadoop2/hdfs/name</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/hadoop1/hdfs/data,file:/hadoop2/hdfs/data,file:/hadoop3/hdfs/data</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
<!-- Enable WebHDFS (REST API) on the NameNodes and DataNodes; optional -->
<property>
  <name>dfs.webhdfs.enabled</name>
  <value>true</value>
</property>
<property>
  <!-- List of permitted/excluded DataNodes. -->
  <name>dfs.hosts.exclude</name>
  <value>/usr/local/hadoop-2.9.2/etc/hadoop/excludes</value>
</property>
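Note that dfs.hosts.exclude points to a file the NameNode expects to find on disk; if it is missing the NameNode will typically refuse to start. Creating it empty is enough:
touch /usr/local/hadoop-2.9.2/etc/hadoop/excludes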
etc/hadoop/mapred-site.xml
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
etc/hadoop/yarn-site.xml
<!-- Make the NodeManager load the shuffle service at startup -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<!-- ResourceManager address -->
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>hadoop1</value>
</property>
etc/hadoop/slaves
hadoop1
hadoop2
hadoop3
First-time startup commands
1. Start ZooKeeper on every node first; run the following on each node:
   bin/zkServer.sh start
2. On one of the NameNode nodes, create the namespace in ZooKeeper:
   hdfs zkfc -formatZK
3. On every JournalNode node, start the JournalNode:
   sbin/hadoop-daemon.sh start journalnode
4. On the primary NameNode node, format the NameNode and JournalNode directories:
   hdfs namenode -format ns
5. On the primary NameNode node, start the NameNode process:
   sbin/hadoop-daemon.sh start namenode
6. On the standby NameNode node, run the first command below. It formats the standby's directories and copies the metadata over from the primary NameNode, without re-formatting the JournalNode directories. Then start the standby NameNode process with the second command:
   hdfs namenode -bootstrapStandby
   sbin/hadoop-daemon.sh start namenode
7. On both NameNode nodes, run:
   sbin/hadoop-daemon.sh start zkfc
8. On all DataNode nodes, start the DataNode:
   sbin/hadoop-daemon.sh start datanode
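Once all daemons are up, a quick way to confirm the HA state is hdfs haadmin with the NameNode ids defined in hdfs-site.xml (nn1 and nn2); one should report active and the other standby:
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2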
Day-to-day start/stop commands
# Start script: starts the services on all nodes
sbin/start-dfs.sh
# Stop script: stops the services on all nodes
sbin/stop-dfs.sh
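The scripts above cover HDFS only. If you also want to run MapReduce jobs against the YARN settings configured earlier, YARN has its own start/stop scripts (run on the ResourceManager host, hadoop1 in this layout):
sbin/start-yarn.sh
sbin/stop-yarn.sh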
Verification
Check the running processes with jps.
http://192.168.56.122:50070
http://192.168.56.121:50070
Test file upload and download
# Create a directory
[hadoop@hadoop1 ~]$ hadoop fs -mkdir /test
# Verify
[hadoop@hadoop1 ~]$ hadoop fs -ls /
Found 1 items
drwxr-xr-x - hadoop supergroup 0 2019-04-12 12:16 /test
# Upload a file
[hadoop@hadoop1 ~]$ hadoop fs -put /usr/local/hadoop-2.9.2/LICENSE.txt /test
# Verify
[hadoop@hadoop1 ~]$ hadoop fs -ls /test
Found 1 items
-rw-r--r-- 2 hadoop supergroup 106210 2019-04-12 12:17 /test/LICENSE.txt
# Download the file to /tmp
[hadoop@hadoop1 ~]$ hadoop fs -get /test/LICENSE.txt /tmp
# Verify
[hadoop@hadoop1 ~]$ ls -l /tmp/LICENSE.txt
-rw-r--r--. 1 hadoop hadoop 106210 Apr 12 12:19 /tmp/LICENSE.txt
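Since this is an HA cluster, it is also worth testing automatic failover. A minimal sketch, assuming nn1 on hadoop1 is currently the active NameNode (the <pid> below stands for whatever jps reports for the NameNode process):
# on hadoop1: find and kill the active NameNode
jps | grep NameNode
kill -9 <pid>
# from any node: nn2 should now report active
hdfs haadmin -getServiceState nn2
# restart the NameNode on hadoop1; it should come back as standby
sbin/hadoop-daemon.sh start namenode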