The "no namenode to stop" error when restarting a Hadoop cluster

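Today I changed the Hadoop cluster's configuration files and needed to restart the cluster, but stopping it reported the following errors: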

[hadoop@master ~]# stop-dfs.sh
Stopping namenodes on [master]
master1: no namenode to stop
master2: no namenode to stop
slave2: no datanode to stop
slave1: no datanode to stop


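The cause is that Hadoop's stop scripts locate each daemon (namenode, datanode, journalnode) through its pid file. By default these pid files are saved under /tmp, and Linux periodically deletes old files in that directory (typically after somewhere between 7 days and a month, depending on the system).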

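So once the two files hadoop-hadoop-journalnode.pid and hadoop-hadoop-datanode.pid have been deleted, the stop script can no longer find these two processes on the datanode, even though they are still running.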
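
To confirm this diagnosis on a node, one can check whether the pid file still exists and whether the pid inside it belongs to a live process. The following is a minimal sketch; check_pid_file is a hypothetical helper and not part of Hadoop, and the default path assumes HADOOP_PID_DIR is unset so that /tmp is used.

```shell
#!/usr/bin/env bash
# Sketch only: check_pid_file is a hypothetical helper, not a Hadoop command.
# Prints "missing" if the pid file is gone, "running" if the recorded pid is
# alive, and "stale" if the file exists but the process does not.
check_pid_file() {
  local pid_file="$1"
  if [ ! -f "$pid_file" ]; then
    echo "missing"
    return 1
  fi
  local pid
  pid="$(cat "$pid_file")"
  # kill -0 sends no signal; it only tests whether the process can be signalled
  if kill -0 "$pid" 2>/dev/null; then
    echo "running"
  else
    echo "stale"
    return 2
  fi
}

# Assumed default location when HADOOP_PID_DIR is not set
check_pid_file "${HADOOP_PID_DIR:-/tmp}/hadoop-hadoop-datanode.pid" || true
```

If this prints "missing" while the datanode process is still visible on the node, the pid file was most likely cleaned out of /tmp as described above.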

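Setting export HADOOP_PID_DIR in the configuration file hadoop-env.sh solves the problem. It can also be changed in hadoop-daemon.sh, which loads hadoop-env.sh. Set HADOOP_PID_DIR to /var/hadoop_pid, and remember to create the hadoop_pid directory under /var by hand and make the hadoop user its owner.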
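
The configuration change can be sketched as a small script. set_pid_dir is a hypothetical helper written for this illustration; the location of hadoop-env.sh under $HADOOP_HOME/etc/hadoop and the hadoop:hadoop owner and group are assumptions that may differ on your installation.

```shell
#!/usr/bin/env bash
# Sketch only: set_pid_dir is a hypothetical helper, not part of Hadoop.
# It creates the pid directory and appends the export line to the given
# env file unless an export of HADOOP_PID_DIR is already present.
set_pid_dir() {
  local env_file="$1"
  local pid_dir="$2"
  mkdir -p "$pid_dir"
  if grep -q '^export HADOOP_PID_DIR=' "$env_file" 2>/dev/null; then
    echo "HADOOP_PID_DIR already set in $env_file; edit it by hand" >&2
  else
    printf 'export HADOOP_PID_DIR=%s\n' "$pid_dir" >> "$env_file"
  fi
}

# Usage on each node, run as root (left commented out on purpose;
# the paths and the hadoop:hadoop owner are assumptions):
#   set_pid_dir "$HADOOP_HOME/etc/hadoop/hadoop-env.sh" /var/hadoop_pid
#   chown hadoop:hadoop /var/hadoop_pid
```

The change has to be made on every node, since each daemon writes its pid file locally.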

[hadoop@slave3 ~]$ ls /var/hadoop_pid/
hadoop-hadoop-datanode.pid  hadoop-hadoop-journalnode.pid

Then manually kill the Datanode process on each affected slave (kill -9 pid) and run start-dfs.sh again. The "no datanode to stop" and "no namenode to stop" messages no longer appear, and the problem is solved.
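
Since the pid file is gone, the pid for the kill step has to be looked up some other way. One option, sketched below, is to read it from the output of jps (shipped with the JDK), which prints one "<pid> <MainClass>" pair per line. pid_of is a hypothetical helper for this illustration.

```shell
#!/usr/bin/env bash
# Sketch only: pid_of is a hypothetical helper, not part of Hadoop.
# It reads jps-style lines ("<pid> <MainClass>") on stdin and prints
# the pid whose class name matches the first argument exactly.
pid_of() {
  awk -v name="$1" '$2 == name { print $1 }'
}

# Usage on the affected slave (left commented out on purpose):
#   pid="$(jps | pid_of DataNode)"
#   [ -n "$pid" ] && kill "$pid"     # try a normal kill first
#   # fall back to kill -9 "$pid" only if the process does not exit
```

Trying a plain kill before kill -9 gives the daemon a chance to shut down cleanly; the original fix used kill -9 directly.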

[hadoop@master1 ~]$ start-dfs.sh
16/04/13 17:20:26 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [master1 master2]
master1: starting namenode, logging to /data/usr/hadoop/logs/hadoop-hadoop-namenode-master1.out
master2: starting namenode, logging to /data/usr/hadoop/logs/hadoop-hadoop-namenode-master2.out
slave4: starting datanode, logging to /data/usr/hadoop/logs/hadoop-hadoop-datanode-slave4.out
slave3: starting datanode, logging to /data/usr/hadoop/logs/hadoop-hadoop-datanode-slave3.out
slave2: starting datanode, logging to /data/usr/hadoop/logs/hadoop-hadoop-datanode-slave2.out
slave1: starting datanode, logging to /data/usr/hadoop/logs/hadoop-hadoop-datanode-slave1.out
Starting journal nodes [master1 master2 slave1 slave2 slave3]
slave3: starting journalnode, logging to /data/usr/hadoop/logs/hadoop-hadoop-journalnode-slave3.out
master1: starting journalnode, logging to /data/usr/hadoop/logs/hadoop-hadoop-journalnode-master1.out
slave1: starting journalnode, logging to /data/usr/hadoop/logs/hadoop-hadoop-journalnode-slave1.out
master2: starting journalnode, logging to /data/usr/hadoop/logs/hadoop-hadoop-journalnode-master2.out
slave2: starting journalnode, logging to /data/usr/hadoop/logs/hadoop-hadoop-journalnode-slave2.out
16/04/13 17:20:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting ZK Failover Controllers on NN hosts [master1 master2]
master1: starting zkfc, logging to /data/usr/hadoop/logs/hadoop-hadoop-zkfc-master1.out
master2: starting zkfc, logging to /data/usr/hadoop/logs/hadoop-hadoop-zkfc-master2.out