
Hadoop Operations Notes: Namenode Fails to Start After an Abnormal Shutdown


Our production environment runs CDH5 in HA mode, with two Namenode nodes. The Standby node aborted abnormally because of errors related to the edits files, and every attempt to start it afterwards failed with complaints about files that could not be found.

At first I suspected the problem was limited to the Standby itself, so I tried bootstrapStandby to re-initialize the Standby node, but the problem remained. Then, because I tried to restart the ZKFC (ZooKeeper Failover Controller) service, the Active node failed over automatically; when the failover did not succeed and I tried to switch back, that node would not start either, with exactly the same errors as the Standby. At that point the entire Hadoop cluster was down.

The problem was serious. After searching all over Google and finding nothing useful, I had to turn to my boss. He eventually came up with an idea: decompile the fsimage (metadata) file and the edits (edit log) files into text, look at exactly what they contain, and figure out why loading the edits files fails.

That idea turned out to be the breakthrough, and it eventually let us repair the whole cluster.

Environment:

idc2-server1: namenode, journalnode, zkfc

idc2-server2: namenode, journalnode, zkfc

idc2-server3: journalnode, resourcemanager

What happened, in detail:

First, the Standby Namenode reported the following error and then shut itself down abnormally:

2014-11-11 02:12:54,057 FATAL org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Unknown error encountered while tailing edits. Shutting down standby NN.
java.io.FileNotFoundException: File does not exist: /user/dong/data/dpp/classification/gender/vw-output-train/2014-10-30-research-with-confict-fix-bug-rerun/_temporary/1/_temporary/attempt_1415171013961_37060_m_000015_0/part-00015
       at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
       at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
...

The key message is "Unknown error encountered while tailing edits. Shutting down standby NN."

We then tried to start the Standby Namenode service, which failed with the following error:

2014-11-12 04:26:28,860 INFO org.apache.hadoop.hdfs.server.namenode.EditLogInputStream: Fast-forwarding stream 'http://idc2-server10.heylinux.com:8480/getJournal?jid=idc2&segmentTxId=240823073&storageInfo=-55%3A1838233660%3A0%3ACID-d77ea84b-1b24-4bc2-ad27-7d2384d222d6' to transaction ID 240741256
2014-11-12 04:26:28,874 ERROR org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception on operation CloseOp [length=0, inodeId=0, path=/user/dong/data/dpp/classification/gender/vw-output-train/2014-10-30-research-with-confict-fix-bug-rerun/_temporary/1/_temporary/attempt_1415171013961_37060_m_000015_0/part-00015, replication=3, mtime=1415671845582, atime=1415670522749, blockSize=134217728, blocks=[], permissions=oozie:hdfs:rw-r--r--, aclEntries=null, clientName=, clientMachine=, opCode=OP_CLOSE, txid=240823292]
java.io.FileNotFoundException: File does not exist: /user/dong/data/dpp/classification/gender/vw-output-train/2014-10-30-research-with-confict-fix-bug-rerun/_temporary/1/_temporary/attempt_1415171013961_37060_m_000015_0/part-00015
 at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
 at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
...
2014-11-12 04:26:32,641 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception loading fsimage
java.io.FileNotFoundException: File does not exist: /user/dong/data/dpp/classification/gender/vw-output-train/2014-10-30-research-with-confict-fix-bug-rerun/_temporary/1/_temporary/attempt_1415171013961_37060_m_000015_0/part-00015
 at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)

It reports that the file "/user/dong/data/dpp/classification/gender/vw-output-train/2014-10-30-research-with-confict-fix-bug-rerun/_temporary/1/_temporary/attempt_1415171013961_37060_m_000015_0/part-00015" cannot be found.

In fact, this was a temporary file: it did not matter and had already been deleted.

But the log also says "ERROR org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception on operation CloseOp", so the failure happens while loading an edits file: applying the OP_CLOSE operation fails because the file cannot be found.

At first we suspected that only the fsimage or edits files on the Standby were damaged, so we ran bootstrapStandby on the Standby. That process automatically fetches the latest fsimage from the Active Namenode and then downloads and replays the newer edits files from the JournalNode servers.

sudo -u hdfs hadoop namenode -bootstrapStandby

However, after the re-initialization, loading the edits files still failed with exactly the same error.

Next, because I tried to restart the ZKFC (ZooKeeper Failover Controller) service, the Active Namenode automatically failed over to the Standby; since the Standby could not take over, when the Active tried to switch back it could no longer start normally either, failing with the same errors as the Standby.

With that, the whole Hadoop cluster was down. After searching all over Google without finding anything useful, I called my boss, who also could not find a single useful hit for these errors. He then tried grepping the path above inside the edits files and found a few related segments:

# cd /data1/dfs/nn/
# cp -rpa current current.backup.orig
# cd /data2/dfs/nn/
# cp -rpa current current.backup.orig
# cd /data1/dfs/nn/current
# grep attempt_1415171013961_37060_m_000015_0 *
Binary file edits_0000000000240687057-0000000000240698453 matches
Binary file edits_0000000000240823073-0000000000240838096 matches
Binary file edits_inprogress_0000000000244853266 matches
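
If you want to run the same search yourself, a minimal sketch like the following (assuming the same metadata and JournalNode directory layout as in this environment) lists every edits segment, on both Namenode disks and on the local JournalNode, that mentions the offending attempt ID:

# Hypothetical helper: find every edits segment that references the vanished file
for d in /data1/dfs/nn/current /data2/dfs/nn/current /var/hadoop/data/dfs/jn/idc2prod/current; do
  echo "== $d =="
  grep -l attempt_1415171013961_37060_m_000015_0 "$d"/edits_* 2>/dev/null
done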

So we wondered whether these edits files, or the fsimage file, could give us some clues.

The following two pages describe the decompilation tools that ship with Hadoop for fsimage and edits files:

http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsImageViewer.html

http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsEditsViewer.html

One passage about edits files gave us real hope:

In case there is some problem with hadoop cluster and the edits file is corrupted it is possible to save at least part of the edits file that is correct. This can be done by converting the binary edits to XML, edit it manually and then convert it back to binary.

From that description we learned that an edits file can indeed become corrupted, and that decompiling it, fixing it by hand, and compiling it back to binary as a replacement is a viable repair strategy. So we copied out the two related edits files found above and decompiled them:

mkdir /tmp2/
cd /data1/dfs/nn/current
cp edits_0000000000240687057-0000000000240698453 /tmp2/
cp edits_0000000000240823073-0000000000240838096 /tmp2/
cd /tmp2/
hdfs oev -i edits_0000000000240687057-0000000000240698453 -o edits_0000000000240687057-0000000000240698453.xml
hdfs oev -i edits_0000000000240823073-0000000000240838096 -o edits_0000000000240823073-0000000000240838096.xml
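
Before editing anything, it can be worth getting a quick summary of what a segment contains. Besides xml and binary, hdfs oev also has a stats processor that prints a per-opcode count, which is a cheap, read-only way to see how many OP_CLOSE and OP_DELETE records you are dealing with:

hdfs oev -i edits_0000000000240823073-0000000000240838096 -o edits_0000000000240823073-0000000000240838096.stats -p stats
cat edits_0000000000240823073-0000000000240838096.stats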

Decompilation produced two XML files. Searching them for "/user/dong/data/dpp/classification/gender/vw-output-train/2014-10-30-research-with-confict-fix-bug-rerun/_temporary/1/_temporary/attempt_1415171013961_37060_m_000015_0/part-00015" turned up the related OP_CLOSE and OP_DELETE records:

<RECORD>
    <OPCODE>OP_DELETE</OPCODE>
    <DATA>
      <TXID>240818498</TXID>
      <LENGTH>0</LENGTH>
      <PATH>/user/dong/data/dpp/classification/gender/vw-output-train/2014-10-30-research-with-confict-fix-bug-rerun/_temporary/1/_temporary/attempt_1415171013961_37060_m_000015_0/part-00015</PATH>
      <TIMESTAMP>1415671972595</TIMESTAMP>
      <RPC_CLIENTID>4a38861d-3bee-40e6-abb6-d2b58f313781</RPC_CLIENTID>
      <RPC_CALLID>676</RPC_CALLID>
    </DATA>
  </RECORD>
<RECORD>
    <OPCODE>OP_CLOSE</OPCODE>
    <DATA>
      <TXID>240823292</TXID>
      <LENGTH>0</LENGTH>
      <INODEID>0</INODEID>
      <PATH>/user/dong/data/dpp/classification/gender/vw-output-train/2014-10-30-research-with-confict-fix-bug-rerun/_temporary/1/_temporary/attempt_1415171013961_37060_m_000015_0/part-00015</PATH>
      <REPLICATION>3</REPLICATION>
      <MTIME>1415671845582</MTIME>
      <ATIME>1415670522749</ATIME>
      <BLOCKSIZE>134217728</BLOCKSIZE>
      <CLIENT_NAME></CLIENT_NAME>
      <CLIENT_MACHINE></CLIENT_MACHINE>
      <PERMISSION_STATUS>
        <USERNAME>oozie</USERNAME>
        <GROUPNAME>hdfs</GROUPNAME>
        <MODE>420</MODE>
      </PERMISSION_STATUS>
    </DATA>
  </RECORD>

As you can see, for "/user/dong/data/dpp/classification/gender/vw-output-train/2014-10-30-research-with-confict-fix-bug-rerun/_temporary/1/_temporary/attempt_1415171013961_37060_m_000015_0/part-00015" the OP_DELETE (TXID 240818498) comes before the OP_CLOSE (TXID 240823292), which is why applying OP_CLOSE reports "File does not exist".
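
To confirm this kind of ordering quickly, you can pull out just the OPCODE and TXID lines of the records that touch the path. A rough sketch (-B 5 is enough context here because PATH appears at most five lines after OPCODE in these records):

# Show, in transaction order, which opcodes reference the missing file
grep -B 5 'attempt_1415171013961_37060_m_000015_0/part-00015' edits_*.xml | grep -E 'OPCODE|TXID'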

So we tried replacing the OP_CLOSE record with something harmless, for example a permission change on a file that still exists, keeping TXID 240823292 so that the transaction sequence of the edits file stays intact.

<RECORD>
    <OPCODE>OP_SET_PERMISSIONS</OPCODE>
    <DATA>
      <TXID>240823292</TXID>
      <SRC>/user/oozie-heylinux/.staging/job_1415171013961_37194</SRC>
      <MODE>504</MODE>
    </DATA>
  </RECORD>

After editing, convert the XML file back to binary format.

cd /tmp2/
cp edits_0000000000240823073-0000000000240838096.xml edits_0000000000240823073-0000000000240838096.xml.orig
vim edits_0000000000240823073-0000000000240838096.xml
hdfs oev -i edits_0000000000240823073-0000000000240838096.xml -o edits_0000000000240823073-0000000000240838096 -p binary
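
A sanity check that seems worthwhile before shipping the rebuilt file anywhere: convert the regenerated binary back to XML and diff it against the XML you just edited. If the round trip is clean, the two should match (apart from, at most, trivial formatting differences):

# Round-trip check on the rebuilt segment (read-only; nothing is modified)
hdfs oev -i edits_0000000000240823073-0000000000240838096 -o roundtrip-check.xml
diff roundtrip-check.xml edits_0000000000240823073-0000000000240838096.xml && echo "round-trip OK"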

Then copy the rebuilt binary file back into the Namenode metadata directories and sync it to the JournalNode servers:

cd /var/hadoop/data/dfs/jn/idc2prod/
cp -rpa current current.backup.1
cd /tmp2/
cp edits_0000000000240823073-0000000000240838096 /data1/dfs/nn/current/
cp edits_0000000000240823073-0000000000240838096 /data2/dfs/nn/current/
cp edits_0000000000240823073-0000000000240838096 /var/hadoop/data/dfs/jn/idc2prod/current/
scp edits_0000000000240823073-0000000000240838096 root@idc2-server2:/var/hadoop/data/dfs/jn/idc2prod/current/
scp edits_0000000000240823073-0000000000240838096 root@idc2-server3:/var/hadoop/data/dfs/jn/idc2prod/current/
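
It is also easy to verify that every copy of the replacement segment is byte-identical before restarting anything, for example with md5sum (the hostnames are simply the ones from this environment):

# Compare checksums of all local and remote copies of the replacement segment
f=edits_0000000000240823073-0000000000240838096
md5sum /data1/dfs/nn/current/$f /data2/dfs/nn/current/$f /var/hadoop/data/dfs/jn/idc2prod/current/$f
for h in idc2-server2 idc2-server3; do
  ssh root@$h md5sum /var/hadoop/data/dfs/jn/idc2prod/current/$f
done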

Then start the namenode service. The previous error is gone; in its place, the same error now appears for a different file.

2014-11-12 08:57:13,053 INFO org.apache.hadoop.hdfs.server.namenode.EditLogInputStream: Fast-forwarding stream 'http://idc2-server1.heylinux.com:8480/getJournal?jid=idc2prod&segmentTxId=240823073&storageInfo=-55%3A1838233660%3A0%3ACID-d77ea84b-1b24-4bc2-ad27-7d2384d222d6' to transaction ID 240299210
2014-11-12 08:57:13,063 ERROR org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception on operation CloseOp [length=0, inodeId=0, path=/user/dong/data/dpp/classification/gender/vw-output-train/2014-10-30-research-with-confict-fix-bug-rerun/_temporary/1/_temporary/attempt_1415171013961_37060_m_000018_0/part-00018, replication=3, mtime=1415671845675, atime=1415670519537, blockSize=134217728, blocks=[], permissions=oozie:hdfs:rw-r--r--, aclEntries=null, clientName=, clientMachine=, opCode=OP_CLOSE, txid=240823337]
java.io.FileNotFoundException: File does not exist: /user/dong/data/dpp/classification/gender/vw-output-train/2014-10-30-research-with-confict-fix-bug-rerun/_temporary/1/_temporary/attempt_1415171013961_37060_m_000018_0/part-00018
 at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
 at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
...

2014-11-12 08:57:16,847 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception loading fsimage
java.io.FileNotFoundException: File does not exist: /user/dong/data/dpp/classification/gender/vw-output-train/2014-10-30-research-with-confict-fix-bug-rerun/_temporary/1/_temporary/attempt_1415171013961_37060_m_000018_0/part-00018
 at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
...

From here on, it is a matter of repeating the steps above. Sometimes there is a pattern and you can batch-replace the OP_CLOSE records of all the repeatedly failing files under the same directory (see the sketch after this paragraph); more often the files are random, and you have to edit the XML again, compile it back to binary, start the namenode, look at the next error, and fix it, over and over, until the Namenode finally starts successfully.
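
For the batch case, a rough sketch: list the TXID of every OP_CLOSE under the repeatedly failing directory, then replace all of those records with harmless OP_SET_PERMISSIONS records (each keeping its original TXID) in a single editing pass. The directory prefix below is just the one from this incident; adjust it to whatever the errors report:

# List the TXIDs of every OP_CLOSE that touches the problem directory
# (-B 5 works because PATH is at most five lines after OPCODE in these records)
grep -B 5 '2014-10-30-research-with-confict-fix-bug-rerun/_temporary' edits_0000000000240823073-0000000000240838096.xml | grep -E 'OPCODE|TXID' | grep -A 1 'OP_CLOSE'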

During this process we also hit an error about OP_ADD_BLOCK; it was caused by the last edits file throwing errors about OP_UPDATE_BLOCK while being compiled from XML back to binary.

I replaced the offending records in the same way as above, and only then did the edits file compile back to binary successfully.

The fix is the same: use the "Encountered exception on operation AddBlockOp" message to locate the relevant OP_ADD_BLOCK record and replace it.

2014-11-12 18:07:39,070 ERROR org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception on operation AddBlockOp [path=/user/dong/data/dpp/classification/gender/vw-input/2014-10-30-research-with-no-confict-fix-bug-rerun/all_labelled/_temporary/1/_temporary/attempt_1415171013961_42350_m_001474_0/part-m-01474, penultimateBlock=NULL, lastBlock=blk_1109647729_35920089, RpcClientId=, RpcCallId=-2]
java.lang.IllegalStateException

Finally, once the Namenode started successfully, it reported a large number of missing blocks. The fix is to use fsck to remove the files containing these bad blocks.

hadoop fsck / -files -blocks -locations | tee -a fsck.out

Then, using the block information collected in fsck.out, run "hadoop fsck / -move" to clean up the corrupt files (-move relocates them to /lost+found).
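
For reference, fsck can also list the corrupt files directly, and the main choice is between -move and -delete. A minimal sketch, assuming these temporary output files can simply be discarded:

hdfs fsck / -list-corruptfileblocks
hadoop fsck / -move      # moves files with corrupt/missing blocks to /lost+found
# hadoop fsck / -delete  # or remove them outright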

Finally, leave safemode, and life is good again.

hadoop dfsadmin -safemode leave
