Hadoop Source Code Study Notes, NameNode Startup Flow Part 5: Disk Space Check and Safe Mode Check

Date: 2020-05-22



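This post looks at the background services and preparation work that run before the NameNode finishes starting and that are common to both the active and standby states, which the source code calls CommonServices. It mainly covers the disk space check, the available-resource check, and safe mode. As before, the post has three parts: source call analysis, a pseudocode outline of the core flow, and a call-relationship diagram.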

Part 1: Source call analysis.

　　Continuing from the previous post on the RpcServer, we return to the initialize() method.

protected void initialize(Configuration conf) throws IOException {
  // The variable names below map to configuration keys that can be looked up in hdfs-default.xml
  if (conf.get(HADOOP_USER_GROUP_METRICS_PERCENTILES_INTERVALS) == null) {
    String intervals = conf.get(DFS_METRICS_PERCENTILES_INTERVALS_KEY);
    if (intervals != null) {
      conf.set(HADOOP_USER_GROUP_METRICS_PERCENTILES_INTERVALS, intervals);
    }
  }
  ......
  // Core code: start the HttpServer
  if (NamenodeRole.NAMENODE == role) {
    startHttpServer(conf);
  }
  this.spanReceiverHost = SpanReceiverHost.getInstance(conf);
  // Core code: initialize FSNamesystem
  loadNamesystem(conf);
  // Core code: create an RPC server instance
  rpcServer = createRpcServer(conf);
  ......
  // Core code: start the common service components, including the RPC server
  startCommonServices(conf);
}

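　　The core code for this post is the last line, startCommonServices(). It delegates to namesystem.startCommonServices() in FSNamesystem, so let's step in and see which services get started.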

/**
 * Start services common to both active and standby states
 */
void startCommonServices(Configuration conf, HAContext haContext) throws IOException {
  ......
  try {
    // Create a NameNodeResourceChecker, used to check whether the disks on the NameNode host have enough space
    nnResourceChecker = new NameNodeResourceChecker(conf);
    // Check whether available resources are sufficient; if not, a warning is logged and the NameNode ends up in safe mode
    checkAvailableResources();
    // Assert that safe mode has been instantiated and that the replication queues are not being populated yet
    assert safeMode != null && !isPopulatingReplQueues();
    // NameNode startup now enters the SAFEMODE phase and waits for blocks to be reported
    StartupProgress prog = NameNode.getStartupProgress();
    prog.beginPhase(Phase.SAFEMODE);
    prog.setTotal(Phase.SAFEMODE, STEP_AWAITING_REPORTED_BLOCKS, getCompleteBlocksTotal());
    // Set the total number of blocks
    setBlockTotal();
    // Start the background threads inside BlockManager
    blockManager.activate(conf);
  } finally {
    writeUnlock();
  }
  ......
}

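　　We walk through the core logic of this code step by step, in order. Start with the first piece, new NameNodeResourceChecker(). This line instantiates a NameNodeResourceChecker object. Judging by its name, the class presumably implements some kind of resource check. Let's step in and verify that guess: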

/**
 *
 * NameNodeResourceChecker provides a method -
 * <code>hasAvailableDiskSpace</code> - which will return true if and only if
 * the NameNode has disk space available on all required volumes, and any volume
 * which is configured to be redundant. Volumes containing file system edits dirs
 * are added by default, and arbitrary extra volumes may be configured as well.
 *
 * Author's note: hasAvailableDiskSpace() returns true only when the NameNode
 * has enough disk space on every volume it needs, covering the file system
 * edits dirs and any dirs set in the configuration. Otherwise it returns false.
 */
@InterfaceAudience.Private
public class NameNodeResourceChecker {
  ......
  // Space (in bytes) reserved per volume.
  // The amount of space to keep reserved on each configured disk path
  private final long duReserved;

  // A map holding every configured path that should be checked;
  // initialized when the NameNodeResourceChecker is instantiated
  private Map<String, CheckedVolume> volumes;
  ......

  @VisibleForTesting
  class CheckedVolume implements CheckableNameNodeResource {
    ......
    @Override
    public boolean isResourceAvailable() {
      long availableSpace = df.getAvailable();
      ......
      // duReserved is the configured minimum amount of free disk space.
      // It defaults to 100 MB: by default the edits directory needs at least
      // 100 MB free for writing logs, otherwise the resource check fails.
      if (availableSpace < duReserved) {
        LOG.warn("Space available on volume '" + volume + "' is "
            + availableSpace +
            ", which is below the configured reserved amount " + duReserved);
        return false;
      } else {
        return true;
      }
    }
    ......
  }

  /**
   * Create a NameNodeResourceChecker, which will check the edits dirs and any
   * additional dirs to check set in <code>conf</code>.
   *
   * Author's note: the constructor initializes the volumes map, which holds the
   * paths whose disk space must be checked. The map is later handed to
   * NameNodeResourcePolicy, which iterates over it to check the edits dirs and
   * any dirs set in the configuration.
   */
  public NameNodeResourceChecker(Configuration conf) throws IOException {
    this.conf = conf;
    volumes = new HashMap<String, CheckedVolume>();
    ......
    // Read the DFS_NAMENODE_DU_RESERVED_KEY setting from the configuration
    duReserved = conf.getLong(DFSConfigKeys.DFS_NAMENODE_DU_RESERVED_KEY,
        DFSConfigKeys.DFS_NAMENODE_DU_RESERVED_DEFAULT);
    ......
  }

  /**
   * Add the volume of the passed-in directory to the list of volumes to check.
   * If <code>required</code> is true, and this volume is already present, but
   * is marked redundant, it will be marked required. If the volume is already
   * present but marked required then this method is a no-op.
   *
   * @param directoryToCheck
   *          The directory whose volume will be checked for available space.
   */
  private void addDirToCheck(URI directoryToCheck, boolean required) throws IOException {
    ......
    CheckedVolume newVolume = new CheckedVolume(dir, required);
    CheckedVolume volume = volumes.get(newVolume.getVolume());
    if (volume == null || !volume.isRequired()) {
      volumes.put(newVolume.getVolume(), newVolume);
    }
  }

  /**
   * Return true if disk space is available on at least one of the configured
   * redundant volumes, and all of the configured required volumes.
   *
   * @return True if the configured amount of disk space is available on at
   *         least one redundant volume and all of the required volumes, false
   *         otherwise.
   */
  public boolean hasAvailableDiskSpace() {
    return NameNodeResourcePolicy.areResourcesAvailable(volumes.values(), minimumRedundantVolumes);
  }
  ......
}

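　　The class comment makes it clear that NameNodeResourceChecker exists to check whether the disk space on the configured paths is enough for normal use. Its two core variables are duReserved and volumes. We look at each in turn: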

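　　duReserved is a long. The line duReserved = conf.getLong(DFSConfigKeys.DFS_NAMENODE_DU_RESERVED_KEY,...); traces back to the dfs.namenode.resource.du.reserved option listed in hdfs-default.xml. When the option is not set explicitly, getLong() falls back to its default argument, which is 100 MB. The available space on each path to be checked is compared with duReserved, and if it is below duReserved the check returns false. What does that ultimately lead to? Hold that question for now; the code traced below answers it.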
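　　To make the threshold comparison concrete, here is a minimal sketch of the same check outside Hadoop. It is an illustration under assumptions, not the Hadoop implementation: the class name ReservedSpaceCheck and its methods are hypothetical, and it reads free space with java.io.File.getUsableSpace() rather than the df helper used in CheckedVolume.

```java
import java.io.File;

// Hypothetical sketch, not Hadoop code: mirrors the comparison made in
// CheckedVolume.isResourceAvailable() with plain JDK calls.
final class ReservedSpaceCheck {
    // 100 MB, the default reserved size discussed above.
    static final long DEFAULT_RESERVED_BYTES = 100L * 1024 * 1024;

    // Pure comparison: a volume passes unless its available space is below the reserve.
    static boolean isResourceAvailable(long availableBytes, long reservedBytes) {
        return availableBytes >= reservedBytes;
    }

    // Reads the usable space of the volume holding dir and applies the comparison.
    static boolean hasEnoughSpace(File dir, long reservedBytes) {
        return isResourceAvailable(dir.getUsableSpace(), reservedBytes);
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        System.out.println(dir.getAbsolutePath() + " has enough space: "
            + hasEnoughSpace(dir, DEFAULT_RESERVED_BYTES));
    }
}
```

　　Separating the pure comparison from the file system call keeps the rule easy to test without depending on how full the local disk happens to be.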

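　　volumes is a map. From addDirToCheck() and its comment we can tell that it mainly holds the dirs whose disk space must be checked. hasAvailableDiskSpace() then passes the values of this map to the class that actually implements the checking logic, NameNodeResourcePolicy, which iterates over the volumes to check resources.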
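　　The interplay between the volumes map and the policy can be sketched as follows. This is a simplified illustration, not the Hadoop source: VolumePolicySketch and Volume are hypothetical stand-ins, availability is a plain boolean instead of a real disk check, and edge cases of the real NameNodeResourcePolicy (for example, no redundant volumes being configured at all) are not modeled.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: "all required volumes plus enough redundant volumes".
final class VolumePolicySketch {
    // Illustrative stand-in for CheckedVolume.
    static final class Volume {
        final String name;
        final boolean required;
        final boolean available;

        Volume(String name, boolean required, boolean available) {
            this.name = name;
            this.required = required;
            this.available = available;
        }
    }

    private final Map<String, Volume> volumes = new HashMap<>();

    // Same merge rule as addDirToCheck(): an existing required entry is never replaced.
    void add(Volume candidate) {
        Volume existing = volumes.get(candidate.name);
        if (existing == null || !existing.required) {
            volumes.put(candidate.name, candidate);
        }
    }

    // Every required volume must be available, and at least
    // minimumRedundant of the redundant volumes must be available.
    static boolean areResourcesAvailable(Collection<Volume> vols, int minimumRedundant) {
        int redundantAvailable = 0;
        for (Volume v : vols) {
            if (v.required) {
                if (!v.available) {
                    return false;
                }
            } else if (v.available) {
                redundantAvailable++;
            }
        }
        return redundantAvailable >= minimumRedundant;
    }

    boolean hasAvailableDiskSpace(int minimumRedundant) {
        return areResourcesAvailable(volumes.values(), minimumRedundant);
    }

    public static void main(String[] args) {
        VolumePolicySketch checker = new VolumePolicySketch();
        checker.add(new Volume("/data/edits", true, true));
        checker.add(new Volume("/data/extra", false, false));
        System.out.println(checker.hasAvailableDiskSpace(0));
        System.out.println(checker.hasAvailableDiskSpace(1));
    }
}
```

　　Keying the map by volume name means that several directories on the same volume collapse into a single entry, so each volume is checked only once.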

　　Back to the earlier question: what happens when false is returned? Tracing the callers of hasAvailableDiskSpace() step by step eventually leads to an inner thread class of FSNamesystem:

/**
 * Periodically calls hasAvailableResources of NameNodeResourceChecker, and if
 * there are found to be insufficient resources available, causes the NN to
 * enter SAFE MODE. If resources are later found to have returned to
 * acceptable levels, this daemon will cause the NN to exit safe mode.
 */
class NameNodeResourceMonitor implements Runnable {
  boolean shouldNNRmRun = true;

  @Override
  public void run () {
    try {
      while (fsRunning && shouldNNRmRun) {
        checkAvailableResources();
        if(!nameNodeHasResourcesAvailable()) {
          String lowResourcesMsg = "NameNode low on available disk space. ";
          if (!isInSafeMode()) {
            FSNamesystem.LOG.warn(lowResourcesMsg + "Entering safe mode.");
          } else {
            FSNamesystem.LOG.warn(lowResourcesMsg + "Already in safe mode.");
          }
          // Enter safe mode
          enterSafeMode(true);
        }
        ......
      }
    } catch (Exception e) {
      FSNamesystem.LOG.error("Exception in NameNodeResourceMonitor: ", e);
    }
  }
}

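　　The comment tells us that this thread periodically calls hasAvailableResources() on NameNodeResourceChecker (the method shown earlier as hasAvailableDiskSpace()) to check disk resources. As soon as it finds that resources are insufficient, it puts the NameNode into safe mode. If a later check shows that resources are back at an acceptable level, the thread takes the NameNode out of safe mode again. Reading run() with that in mind: inside a while loop it first checks whether resources are available; if they are not, it logs a warning and then calls enterSafeMode(true) to enter safe mode.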
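　　The loop described above can be sketched in isolation as follows. This is a hedged illustration rather than the FSNamesystem code: ResourceMonitorSketch, checkOnce() and the sleep interval are assumptions made for the example, the resource check is injected as a BooleanSupplier, and only the entering branch visible in the excerpt is modeled (the elided part that leaves safe mode is not).

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch of a periodic resource monitor.
final class ResourceMonitorSketch implements Runnable {
    private final BooleanSupplier resourcesAvailable; // stand-in for the disk check
    private final long intervalMillis;                // assumed polling interval
    private volatile boolean running = true;
    private volatile boolean inSafeMode = false;

    ResourceMonitorSketch(BooleanSupplier resourcesAvailable, long intervalMillis) {
        this.resourcesAvailable = resourcesAvailable;
        this.intervalMillis = intervalMillis;
    }

    // One iteration of the loop: warn and enter safe mode when resources are low.
    void checkOnce() {
        if (!resourcesAvailable.getAsBoolean()) {
            String msg = "Low on available disk space. ";
            System.err.println(msg + (inSafeMode ? "Already in safe mode." : "Entering safe mode."));
            inSafeMode = true;
        }
    }

    @Override
    public void run() {
        try {
            while (running) {
                checkOnce();
                Thread.sleep(intervalMillis);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt and stop
        }
    }

    void stop() {
        running = false;
    }

    boolean isInSafeMode() {
        return inSafeMode;
    }

    public static void main(String[] args) {
        ResourceMonitorSketch monitor = new ResourceMonitorSketch(() -> false, 1000L);
        monitor.checkOnce();
        System.out.println("in safe mode: " + monitor.isInSafeMode());
    }
}
```

　　Keeping a single iteration in checkOnce() lets the decision logic be exercised without starting a thread or waiting on timers.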

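　　Why enter safe mode when disk resources run low? The reason is simple: with insufficient disk space there is no guarantee that the logs produced by a metadata modification can be written to disk, that is, neither new edits log entries nor the fsimage can be relied on to reach disk. So the NameNode enters safe mode to forbid metadata changes and thereby avoid writing new log data to disk.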

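　　As for what exactly happens when entering safe mode, you can step into enterSafeMode() yourself; the code is not pasted here. It mainly syncs some log files and then, once the sync has completed, takes a lock so that all write operations are blocked.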
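　　The effect of safe mode on writes can be illustrated with a small gate. This is only a sketch of the idea under assumptions, not how FSNamesystem implements it: SafeModeGateSketch, mkdir() and the in-memory edit list are hypothetical, and IllegalStateException stands in for whatever exception the real code raises.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch: metadata writes are rejected once safe mode is on,
// so no new edit entries need to be written to a nearly full disk.
final class SafeModeGateSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<String> editLog = new ArrayList<>(); // stand-in for the edits log
    private boolean safeMode;

    void enterSafeMode() {
        lock.writeLock().lock();
        try {
            safeMode = true;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // A metadata mutation: allowed only outside safe mode.
    void mkdir(String path) {
        lock.writeLock().lock();
        try {
            if (safeMode) {
                throw new IllegalStateException("Cannot create " + path + ": in safe mode");
            }
            editLog.add("MKDIR " + path);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Reads remain possible in safe mode.
    int editCount() {
        lock.readLock().lock();
        try {
            return editLog.size();
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        SafeModeGateSketch ns = new SafeModeGateSketch();
        ns.mkdir("/a");
        ns.enterSafeMode();
        try {
            ns.mkdir("/b");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
        System.out.println("edits recorded: " + ns.editCount());
    }
}
```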

　　Back in startCommonServices(), the next piece of core logic to look at is isPopulatingReplQueues():

/**
 * Check if replication queues are to be populated
 * @return true when node is HAState.Active and not in the very first safemode
 */
@Override
public boolean isPopulatingReplQueues() {
  if (!shouldPopulateReplQueues()) {
    return false;
  }
  return initializedReplQueues;
}

private boolean shouldPopulateReplQueues() {
  if(haContext == null || haContext.getState() == null)
    return false;
  return haContext.getState().shouldPopulateReplQueues();
}

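　　This method checks whether the replication queues are to be populated. If the current HA state says they should not be, it returns false; otherwise it returns whether the replication queues have already been initialized.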

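　　Further down, NameNode.getStartupProgress() returns a StartupProgress instance. This class tracks the startup status and progress of the tasks inside the NameNode process, for example when a phase of a task begins and ends. It is also thread-safe.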
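　　To show what a thread-safe progress tracker of this kind might look like, here is a small sketch. It is not the StartupProgress API: PhaseProgressSketch, its Phase values, and the simplified setTotal(phase, total) signature are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of a thread-safe phase tracker.
final class PhaseProgressSketch {
    enum Phase { LOADING, SAFEMODE }          // illustrative phases only
    enum Status { PENDING, RUNNING, COMPLETE }

    private static final class PhaseState {
        volatile Status status = Status.PENDING;
        final AtomicLong total = new AtomicLong();
        final AtomicLong done = new AtomicLong();
    }

    private final Map<Phase, PhaseState> phases = new ConcurrentHashMap<>();

    // Lazily creates the state for a phase; safe under concurrent callers.
    private PhaseState state(Phase phase) {
        return phases.computeIfAbsent(phase, p -> new PhaseState());
    }

    void beginPhase(Phase phase) { state(phase).status = Status.RUNNING; }

    void endPhase(Phase phase) { state(phase).status = Status.COMPLETE; }

    void setTotal(Phase phase, long total) { state(phase).total.set(total); }

    void increment(Phase phase) { state(phase).done.incrementAndGet(); }

    Status status(Phase phase) {
        PhaseState s = phases.get(phase);
        return s == null ? Status.PENDING : s.status;
    }

    // Fraction of work finished in a phase, between 0.0 and 1.0.
    double fractionComplete(Phase phase) {
        PhaseState s = phases.get(phase);
        if (s == null) {
            return 0.0;
        }
        if (s.status == Status.COMPLETE) {
            return 1.0;
        }
        long total = s.total.get();
        return total > 0 ? Math.min(1.0, (double) s.done.get() / total) : 0.0;
    }

    public static void main(String[] args) {
        PhaseProgressSketch progress = new PhaseProgressSketch();
        progress.beginPhase(Phase.SAFEMODE);
        progress.setTotal(Phase.SAFEMODE, 4);
        progress.increment(Phase.SAFEMODE);
        System.out.println(progress.status(Phase.SAFEMODE) + " "
            + progress.fractionComplete(Phase.SAFEMODE));
    }
}
```

　　Atomic counters and a concurrent map let many reporting threads update progress without a global lock, which is one common way to get the thread safety mentioned above.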

Part 2: Pseudocode outline of the core flow.

　　NameNode.main() // entry point
　　　　|——createNameNode(); // instantiates the NameNode via new NameNode()
　　　　　　|——initialize(); // performs initialization
　　　　　　　　|——startHttpServer(); // start the HttpServer
　　　　　　　　|——loadNamesystem(); // load the metadata
　　　　　　　　|——createRpcServer(); // create and initialize the RPC server instance
　　　　　　　　|——startCommonServices();
　　　　　　　　　　|——namesystem.startCommonServices(); // start background services and threads such as the disk check and safe mode
　　　　　　　　　　　　|——new NameNodeResourceChecker(); // instantiate a NameNodeResourceChecker and collect all disk paths to check
　　　　　　　　　　　　|——checkAvailableResources(); // run the disk space check
　　　　　　　　　　　　|——NameNode.getStartupProgress(); // get the StartupProgress instance that tracks the startup status of NameNode tasks
　　　　　　　　　　　　|——setBlockTotal(); // set the total block count, used later for the safe mode decision
　　　　　　　　　　　　|——blockManager.activate(); // start BlockManager's background threads for block replica handling
　　　　　　　　　　|——rpcServer.start(); // start the rpcServer
　　　　|——join()

Part 3: Call-relationship diagram.