Exploring HDFS data blocks in Hadoop

1. Locating the nodes where a file is stored

Example lookup:

./bin/hadoop fsck /data/bb/bb.txt -files -blocks -racks -locations

[Image: fsck output for /data/bb/bb.txt]

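blk_1076386829_2649976 is the block name with its generation stamp, and the matching meta file is blk_1076386829_2649976.meta. The meta file can be located with the find command. The figure shows that the file is stored on two machines, 117 and 229; as an example, we log in to machine 117.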
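When fsck reports many blocks, the block names can be pulled out of its output instead of being copied by hand. The following is a minimal sketch, assuming only that each block appears in the output as `blk_<id>_<generation stamp>`; the helper name `extract_block_names` is hypothetical, and `grep -o` is assumed to be available (GNU and BSD grep both provide it).

```shell
#!/bin/sh
# extract_block_names: hypothetical helper. Reads text on stdin and prints
# every distinct block name of the form blk_<id>_<generation stamp>.
extract_block_names() {
    grep -o 'blk_[0-9][0-9]*_[0-9][0-9]*' | sort -u
}

# Usage on a machine with the Hadoop client installed:
#   ./bin/hadoop fsck /data/bb/bb.txt -files -blocks -racks -locations | extract_block_names
```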

First, go to the directories configured in dfs.datanode.data.dir (if you have forgotten them, look them up in $HADOOP_HOME/etc/hadoop/hdfs-site.xml).

My machine is configured as follows:

[Image: dfs.datanode.data.dir setting in hdfs-site.xml]
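As an alternative to opening hdfs-site.xml, the effective value can be printed with `hdfs getconf -confKey dfs.datanode.data.dir`. The sketch below splits such a comma-separated value into one local path per line. The helper name `split_data_dirs` is hypothetical, and stripping a `[DISK]`-style storage tag or a `file://` prefix is an assumption about how the value might be written on a given cluster.

```shell
#!/bin/sh
# split_data_dirs: hypothetical helper. Turns a comma-separated
# dfs.datanode.data.dir value into one local path per line.
split_data_dirs() {
    printf '%s\n' "$1" | tr ',' '\n' |
        sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
            -e 's/^\[[A-Z_]*\]//' -e 's|^file://||' |
        sed '/^$/d'
}

# On a DataNode with the hdfs client on PATH:
#   split_data_dirs "$(hdfs getconf -confKey dfs.datanode.data.dir)"
```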

Run find in each of the three directories. An example command:

find /data1/hdfs1/data/current/BP-236683338-10.207.0.217-1403487328282/current -name blk_1076386829_2649976.meta
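Repeating the same find in every data directory can be wrapped in a small loop. This is a sketch under assumptions, not the author's script: `find_block_meta` is a hypothetical helper, and the second and third paths in the usage comment are placeholders because only /data1/hdfs1/data appears in the text.

```shell
#!/bin/sh
# find_block_meta: hypothetical helper. Searches each given DataNode data
# directory for the meta file of one block.
#   $1        block name with generation stamp, e.g. blk_1076386829_2649976
#   $2 ...    data directories (the values of dfs.datanode.data.dir)
find_block_meta() {
    block="$1"
    shift
    for dir in "$@"; do
        # Skip directories that do not exist on this machine.
        [ -d "$dir" ] || continue
        find "$dir" -type f -name "${block}.meta" 2>/dev/null
    done
}

# Usage (placeholder paths for the second and third directory):
#   find_block_meta blk_1076386829_2649976 /data1/hdfs1/data /path/to/dir2 /path/to/dir3
```

Searching from the top of each data directory is slower than starting inside the block pool directory, but it avoids typing the BP-... path by hand.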

This eventually turns up the meta file, as the screenshot shows:

[Image: find result showing the meta file]

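This also locates your file's data: the block file blk_1076386829 sits in the same directory as the meta file, and you can inspect it with cat blk_1076386829.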
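The block file name is the meta file name without the generation stamp and the .meta suffix, so the path can be derived rather than retyped. A minimal sketch, assuming the block file lives next to its meta file; `block_file_for_meta` is a hypothetical helper name.

```shell
#!/bin/sh
# block_file_for_meta: hypothetical helper. Given the path of a
# blk_<id>_<generation stamp>.meta file, prints the path of the block file
# blk_<id> in the same directory.
block_file_for_meta() {
    dir=$(dirname "$1")
    base=$(basename "$1" .meta)      # blk_<id>_<generation stamp>
    printf '%s/%s\n' "$dir" "${base%_*}"   # drop the trailing _<generation stamp>
}

# Usage:
#   cat "$(block_file_for_meta /path/to/blk_1076386829_2649976.meta)"
```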

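I simulated the simple case in which a single block replica is damaged. After the damage, the DataNode does not notice it until it runs its directory scan (the interval is set by dfs.datanode.directoryscan.interval), and the block is not restored until the DataNode sends its next block report to the NameNode (the interval is set by dfs.blockreport.intervalMsec). Only once the NameNode has received the block information does it take recovery action.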

Real situations are certainly more complicated, but this simple walkthrough shows what the two parameters mentioned above do.

Parameter configuration

The two main parameters are configured in hdfs-site.xml as follows:

<property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>master:9001</value>
</property>
<property>
    <name>dfs.blockreport.intervalMsec</name>
    <value>600000</value>
    <description>Determines block reporting interval in milliseconds.</description>
</property>
<property>
    <name>dfs.datanode.directoryscan.interval</name>
    <value>600</value>
</property>

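Both intervals come to 10 minutes: 600000 milliseconds for the block report and 600 seconds for the directory scan.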
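The two settings use different units, which is easy to overlook: dfs.blockreport.intervalMsec is in milliseconds and dfs.datanode.directoryscan.interval is in seconds. The conversion can be checked with a short sketch; `ms_to_min` and `s_to_min` are hypothetical helper names, and both use integer division.

```shell
#!/bin/sh
# ms_to_min / s_to_min: hypothetical helpers converting an interval to whole minutes.
ms_to_min() { echo $(( $1 / 1000 / 60 )); }
s_to_min()  { echo $(( $1 / 60 )); }

ms_to_min 600000   # dfs.blockreport.intervalMsec        -> prints 10
s_to_min 600       # dfs.datanode.directoryscan.interval -> prints 10

# The configured values can be read from the client configuration with:
#   hdfs getconf -confKey dfs.blockreport.intervalMsec
#   hdfs getconf -confKey dfs.datanode.directoryscan.interval
```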

Log details

2016-06-14 21:48:51,083 INFO org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: BlockPool BP-660628275-192.168.1.100-1464787466998 Total blocks: 1, missing metadata files:1, missing block files:1, missing blocks in memory:0, mismatched blocks:0
2016-06-14 21:48:51,084 WARN org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Removed block 1073741825 from memory with missing block file on the disk
2016-06-14 21:49:17,168 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 1 blocks took 0 msec to generate and 1 msecs for RPC and NN processing
2016-06-14 21:49:17,169 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: sent block report, processed command:org.apache.hadoop.hdfs.server.protocol.FinalizeCommand@8a2db2
2016-06-14 21:49:20,977 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-660628275-192.168.1.100-1464787466998:blk_1073741825_1001 src: /192.168.1.101:53718 dest: /192.168.1.102:50010

2016-06-14 21:49:20,984 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Received BP-660628275-192.168.1.100-1464787466998:blk_1073741825_1001 src: /192.168.1.101:53718 dest: /192.168.1.102:50010 of size 1366
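The timeline in this log can be read off by computing the gaps between the three events: the directory scan, the block report, and the arrival of the new replica. A minimal sketch, assuming both timestamps fall on the same day and dropping the millisecond part; `secs_between` is a hypothetical helper name.

```shell
#!/bin/sh
# secs_between: hypothetical helper. Prints the number of seconds from the
# first HH:MM:SS time to the second, assuming the same calendar day.
secs_between() {
    printf '%s %s\n' "$1" "$2" | awk '{
        split($1, a, ":"); split($2, b, ":")
        print (b[1]*3600 + b[2]*60 + b[3]) - (a[1]*3600 + a[2]*60 + a[3])
    }'
}

secs_between 21:48:51 21:49:17   # directory scan -> block report: prints 26
secs_between 21:49:17 21:49:20   # block report -> replica received: prints 3
```

In this run the directory scan noticed the missing block at 21:48:51, the block report followed about 26 seconds later, and the replacement replica arrived from 192.168.1.101 a few seconds after that.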
