Why the Hadoop balancer fails to even out DataNode storage: an analysis

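  A while ago, disk usage across the DataNodes in our Hadoop cluster had become badly unbalanced and the cluster needed rebalancing (mainly because two machines added to the cluster later have much larger disks). I ran the following command: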

bin/start-balancer.sh -threshold 10

  The log output was:

Time Stamp               Iteration#  Bytes Already Moved  Bytes Left To Move  Bytes Being Moved
Mar 10, 2014 11:03:40 AM          0                 0 KB            614.5 GB              20 GB
Mar 10, 2014 11:03:41 AM          1                 0 KB            614.5 GB              20 GB
Mar 10, 2014 11:03:42 AM          2               443 KB            614.5 GB              20 GB
Mar 10, 2014 11:03:43 AM          3               443 KB            614.5 GB              20 GB
Mar 10, 2014 11:03:44 AM          4            891.85 KB            614.5 GB              20 GB
Mar 10, 2014 11:03:45 AM          5            891.85 KB            614.5 GB              20 GB
Mar 10, 2014 11:03:46 AM          6            891.85 KB            614.5 GB              20 GB
Mar 10, 2014 11:03:47 AM          7            891.85 KB           614.49 GB              20 GB
Mar 10, 2014 11:03:48 AM          8            891.85 KB           614.49 GB              20 GB
No block has been moved for 5 iterations. Exiting...
Balancing took 10.023 seconds

Clearly the balancer had already worked out how much data needed to move, yet almost nothing was actually moved. Why?

I checked hadoop-mysql-balancer-master.log and found no Error or Warning entries, so the only option left was to read the source code.

It turns out that the Hadoop balancer checks each block before moving it. The exact requirements are in the code below:

 /* Decide if it is OK to move the given block from source to target
   * A block is a good candidate if
   * 1. the block is not in the process of being moved/has not been moved;
   * 2. the block does not have a replica on the target;
   * 3. doing the move does not reduce the number of racks that the block has
   */

private boolean isGoodBlockCandidate(Source source, 
      BalancerDatanode target, BalancerBlock block) {
    // check if the block is moved or not
    if (movedBlocks.contains(block)) {
      return false;
    }
    if (block.isLocatedOnDatanode(target)) {
      return false;
    }

    boolean goodBlock = false;
    if (cluster.isOnSameRack(source.getDatanode(), target.getDatanode())) {
      // good if source and target are on the same rack
      goodBlock = true;
    } else {
      boolean notOnSameRack = true;
      synchronized (block) {
        for (BalancerDatanode loc : block.locations) {
          if (cluster.isOnSameRack(loc.datanode, target.datanode)) {
            notOnSameRack = false;
            break;
          }
        }
      }
      if (notOnSameRack) {
        // good if target is not on the same rack as any replica
        goodBlock = true;
      } else {
        // good if source is on the same rack as one of the other replicas
        for (BalancerDatanode loc : block.locations) {
          if (loc != source && 
              cluster.isOnSameRack(loc.datanode, source.datanode)) {
            goodBlock = true;
            break;
          }
        }
      }
    }
    return goodBlock;
  }
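
The three conditions can also be read as one small, self-contained check. The following is a minimal sketch, not the Hadoop code: nodes and racks are plain strings, the class and method names are hypothetical, the source node is assumed to be listed among the replica holders, and the rack branches are collapsed into a single comparison of distinct-rack counts before and after the move.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Simplified, hypothetical model of the candidate check; none of this is Hadoop API.
class CandidateCheckSketch {

    // Number of distinct racks covered by the given nodes.
    static int countRacks(List<String> nodes, Map<String, String> rackOf) {
        Set<String> racks = new HashSet<>();
        for (String node : nodes) {
            racks.add(rackOf.get(node));
        }
        return racks.size();
    }

    // replicaNodes: nodes currently holding a replica (assumed to include source).
    // rackOf: node name -> rack name.
    static boolean isGoodCandidate(String source, String target,
                                   List<String> replicaNodes,
                                   Map<String, String> rackOf,
                                   boolean alreadyMoved) {
        // Condition 1: the block has not already been moved in this run.
        if (alreadyMoved) {
            return false;
        }
        // Condition 2: the target must not already hold a replica.
        if (replicaNodes.contains(target)) {
            return false;
        }
        // Condition 3: the move must not reduce the number of distinct racks.
        List<String> after = new ArrayList<>(replicaNodes);
        after.remove(source);
        after.add(target);
        return countRacks(after, rackOf) >= countRacks(replicaNodes, rackOf);
    }

    public static void main(String[] args) {
        // Hypothetical single-rack cluster, as when no rack-awareness script is set.
        Map<String, String> rackOf = Map.of(
                "dn1", "/default-rack", "dn4", "/default-rack", "dn5", "/default-rack");
        List<String> replicas = List.of("dn1", "dn5");
        System.out.println(isGoodCandidate("dn1", "dn5", replicas, rackOf, false));
        System.out.println(isGoodCandidate("dn1", "dn4", replicas, rackOf, false));
    }
}
```

In a single-rack cluster the third condition can never fail, so in this model the only way a not-yet-moved block is rejected is that the target already holds a replica.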
  

Going through the three requirements above one by one to find out why the blocks were not moved:

(1) The block to be moved has not already been moved during this balancing run: satisfied.

(2) The block to be moved does not already have a replica on the target machine: to be verified.

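(3) Moving the block must not reduce the number of racks that hold a replica of it (note that what counts here is the number of distinct racks, not the total number of replicas). The cluster was configured without a rack-awareness script, so by default every node sits in a single rack and this condition is satisfied.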

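So I went to the cluster to verify the second condition and, sure enough, many blocks already had a replica on the two machines that were added later. There was nothing left to move to them, since the replicas were already there, so the balancer gave up and exited.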
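
The exit line in the log ("No block has been moved for 5 iterations. Exiting...") can be read as a counter of consecutive iterations in which nothing moved. Below is a minimal sketch of that rule, assuming only the limit of 5 stated in the log message; the class and method names are hypothetical and this is not the balancer's actual code.

```java
// Simplified, hypothetical model of the "no progress" exit rule.
class NoProgressSketch {

    // Limit taken from the log message; an assumption about the real constant.
    static final int MAX_IDLE_ITERATIONS = 5;

    // bytesMovedPerIteration holds the bytes moved in each iteration (not cumulative).
    // Returns the index of the iteration at which the loop gives up, or -1 if it never does.
    static int exitIteration(long[] bytesMovedPerIteration) {
        int idle = 0;
        for (int i = 0; i < bytesMovedPerIteration.length; i++) {
            if (bytesMovedPerIteration[i] > 0) {
                idle = 0;              // progress resets the counter
            } else {
                idle++;                // another iteration with nothing moved
                if (idle >= MAX_IDLE_ITERATIONS) {
                    return i;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Hypothetical per-iteration byte counts: a little progress, then none.
        long[] moved = {0, 0, 443, 0, 0, 0, 0, 0};
        System.out.println(exitIteration(moved));
    }
}
```

Under this reading, a run in which almost every candidate block is rejected by the target check stops after a handful of iterations, however many bytes are still left to move.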

 

Solution:

1. Run the following command

bin/hadoop fs -setrep -R 2 /

to set the replication factor of every block in the cluster uniformly to the value you configured in hdfs-site.xml:

<property>
<name>dfs.replication</name>
<value>2</value>
</property>

Then restart the Hadoop cluster, wait for Hadoop to finish verifying its blocks, and run the balancer again. That resolves the problem.
