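In YARN, status updates for yarn.nodemanager.local-dirs are implemented in LocalDirsHandlerService (org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService) and its related classes. When the NodeManager (NM) starts, it starts a LocalDirsHandlerService, which periodically checks the availability of the directories configured in yarn.nodemanager.local-dirs and yarn.nodemanager.log-dirs. Under the hood it is a service thread built on java.util.Timer and java.util.TimerTask.
MonitoringTimerTask, an inner class of LocalDirsHandlerService, extends TimerTask.
The MonitoringTimerTask constructor performs the initialization: it reads the configured yarn.nodemanager.log-dirs and yarn.nodemanager.local-dirs and keeps the valid local paths.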
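The Timer/TimerTask pattern described above can be sketched as follows. This is a minimal illustration, not the LocalDirsHandlerService source: the class DirMonitorSketch, its methods, the one-second interval, and the usability test (directory exists and is readable and writable) are all assumptions made for the example.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Timer;
import java.util.TimerTask;

// Hypothetical sketch; names and checks are illustrative, not Hadoop's.
final class DirMonitorSketch {

    // Returns the entries that exist, are directories, and are readable and writable.
    static List<String> usableDirs(List<String> dirs) {
        List<String> good = new ArrayList<>();
        for (String d : dirs) {
            File f = new File(d);
            if (f.isDirectory() && f.canRead() && f.canWrite()) {
                good.add(d);
            }
        }
        return good;
    }

    // Starts a daemon Timer that re-checks the directories every intervalMs milliseconds.
    static Timer startMonitor(final List<String> dirs, long intervalMs) {
        Timer timer = new Timer("disk-health-sketch", true);
        timer.scheduleAtFixedRate(new TimerTask() {
            @Override
            public void run() {
                List<String> good = usableDirs(dirs);
                System.out.println(good.size() + " of " + dirs.size() + " dirs usable");
            }
        }, 0L, intervalMs);
        return timer;
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> dirs = new ArrayList<>();
        dirs.add(System.getProperty("java.io.tmpdir"));
        dirs.add("/path/that/does/not/exist");
        Timer timer = startMonitor(dirs, 1000L);
        Thread.sleep(2500L);  // let the task fire a few times
        timer.cancel();       // stop the periodic check
    }
}
```

A daemon timer is used here so that the sketch does not keep the JVM alive after main returns.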
Commonly used parameters for this thread:
YarnConfiguration.NM_DISK_HEALTH_CHECK_INTERVAL_MS // yarn.nodemanager.disk-health-checker.interval-ms, default: 2 minutes
YarnConfiguration.NM_DISK_HEALTH_CHECK_ENABLE // yarn.nodemanager.disk-health-checker.enable, default: enabled
YarnConfiguration.NM_MIN_HEALTHY_DISKS_FRACTION // yarn.nodemanager.disk-health-checker.min-healthy-disks, default: 0.25, i.e. at least 1/4 of the configured paths must be healthy
In CDH 4.6.0, the MonitoringTimerTask constructor is as follows:
public MonitoringTimerTask(Configuration conf) throws YarnException {
    localDirs = new DirectoryCollection(
        validatePaths(conf.getTrimmedStrings(YarnConfiguration.NM_LOCAL_DIRS)));
    logDirs = new DirectoryCollection(
        validatePaths(conf.getTrimmedStrings(YarnConfiguration.NM_LOG_DIRS)));
    localDirsAllocator = new LocalDirAllocator(YarnConfiguration.NM_LOCAL_DIRS);
    logDirsAllocator = new LocalDirAllocator(YarnConfiguration.NM_LOG_DIRS);
}
In CDH 5.2.0, the constructor reads two additional configuration options:
YarnConfiguration.NM_MAX_PER_DISK_UTILIZATION_PERCENTAGE // percentage of disk that can be used before the dir is taken out of the good dirs list
// yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage, default: 100 (this should be lowered below 100, for example to 80, otherwise disks can easily fill up)
YarnConfiguration.NM_MIN_PER_DISK_FREE_SPACE_MB // minimum space, in MB, that must be available on the disk for the dir to be marked as good
// yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb, default: 0 MB
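These two settings are taken into account when the initial availability of the local dirs is checked: they are passed to DirectoryCollection together with the paths returned by validatePaths.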
public MonitoringTimerTask(Configuration conf) throws YarnRuntimeException {
    float maxUsableSpacePercentagePerDisk = conf.getFloat(
        YarnConfiguration.NM_MAX_PER_DISK_UTILIZATION_PERCENTAGE,
        YarnConfiguration.DEFAULT_NM_MAX_PER_DISK_UTILIZATION_PERCENTAGE);
    long minFreeSpacePerDiskMB = conf.getLong(
        YarnConfiguration.NM_MIN_PER_DISK_FREE_SPACE_MB,
        YarnConfiguration.DEFAULT_NM_MIN_PER_DISK_FREE_SPACE_MB);
    localDirs = new DirectoryCollection(
        validatePaths(conf.getTrimmedStrings(YarnConfiguration.NM_LOCAL_DIRS)),
        maxUsableSpacePercentagePerDisk, minFreeSpacePerDiskMB);
    logDirs = new DirectoryCollection(
        validatePaths(conf.getTrimmedStrings(YarnConfiguration.NM_LOG_DIRS)),
        maxUsableSpacePercentagePerDisk, minFreeSpacePerDiskMB);
    localDirsAllocator = new LocalDirAllocator(YarnConfiguration.NM_LOCAL_DIRS);
    logDirsAllocator = new LocalDirAllocator(YarnConfiguration.NM_LOG_DIRS);
}
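To make the effect of the two per-disk settings concrete, here is a rough sketch of how a utilization ceiling and a free-space floor could be applied to one directory. It is not the DirectoryCollection implementation: the class DiskThresholdSketch, the method withinThresholds, and the exact comparison operators are assumptions for illustration only.

```java
import java.io.File;

// Hypothetical sketch; the real check lives in Hadoop's DirectoryCollection.
final class DiskThresholdSketch {

    // True when used space is at or below the ceiling AND free space is at or above the floor.
    static boolean withinThresholds(long totalBytes, long usableBytes,
                                    float maxUtilizationPercent, long minFreeSpaceMb) {
        if (totalBytes <= 0) {
            return false;  // unknown or missing filesystem: treat as not usable
        }
        double usedPercent = 100.0 * (totalBytes - usableBytes) / totalBytes;
        long freeMb = usableBytes / (1024L * 1024L);
        return usedPercent <= maxUtilizationPercent && freeMb >= minFreeSpaceMb;
    }

    public static void main(String[] args) {
        File dir = new File(System.getProperty("java.io.tmpdir"));
        // Assumed example values: 80% ceiling, 1024 MB floor.
        boolean ok = withinThresholds(dir.getTotalSpace(), dir.getUsableSpace(), 80.0f, 1024L);
        System.out.println(dir + " within thresholds: " + ok);
    }
}
```

With a ceiling of 100 and a floor of 0 MB (the defaults quoted above), this sketch accepts a directory no matter how full the disk is, which is why lowering the percentage is recommended.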
The checker thread tests the availability of the directories at a fixed interval, through the following call chain:
checkDirs -> updateDirsAfterFailure -> areDisksHealthy
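The check is based mainly on the proportion of failed directories among the configured ones: once the failed directories in yarn.nodemanager.local-dirs or yarn.nodemanager.log-dirs reach a certain percentage, the disk check fails and the NM throws an exception.
The exact logic is in the areDisksHealthy method: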
public boolean areDisksHealthy() {
    if (!isDiskHealthCheckerEnabled) {
        // The disk health check is disabled, so always report healthy
        return true;
    }
    int goodDirs = getLocalDirs().size();
    int failedDirs = localDirs.getFailedDirs().size();
    int totalConfiguredDirs = goodDirs + failedDirs;
    // Fraction of healthy yarn.nodemanager.local-dirs
    if (goodDirs / (float) totalConfiguredDirs < minNeededHealthyDisksFactor) {
        return false; // Not enough healthy local-dirs
    }
    goodDirs = getLogDirs().size();
    failedDirs = logDirs.getFailedDirs().size();
    totalConfiguredDirs = goodDirs + failedDirs;
    // Fraction of healthy yarn.nodemanager.log-dirs
    if (goodDirs / (float) totalConfiguredDirs < minNeededHealthyDisksFactor) {
        return false; // Not enough healthy log-dirs
    }
    return true;
}
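As a worked example of the ratio test in areDisksHealthy, the standalone sketch below applies the same comparison to a few directory counts with the default fraction 0.25. The class HealthyFractionSketch and the method hasEnoughHealthyDirs are hypothetical names introduced only for this example.

```java
// Hypothetical sketch mirroring the comparison used in areDisksHealthy.
final class HealthyFractionSketch {

    // Returns false only when goodDirs / total is strictly below minHealthyFraction.
    static boolean hasEnoughHealthyDirs(int goodDirs, int failedDirs, float minHealthyFraction) {
        int totalConfiguredDirs = goodDirs + failedDirs;
        return !(goodDirs / (float) totalConfiguredDirs < minHealthyFraction);
    }

    public static void main(String[] args) {
        System.out.println(hasEnoughHealthyDirs(3, 1, 0.25f)); // true:  0.75 >= 0.25
        System.out.println(hasEnoughHealthyDirs(1, 3, 0.25f)); // true:  0.25 is not below 0.25
        System.out.println(hasEnoughHealthyDirs(1, 4, 0.25f)); // false: 0.20 < 0.25
        System.out.println(hasEnoughHealthyDirs(0, 4, 0.25f)); // false: 0.00 < 0.25
    }
}
```

Because the comparison is strict, a node with four configured directories stays healthy with a single good directory and only fails the check once all four are bad.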