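Copyright: this article is jointly owned by the author and cnblogs (博客園). Reposting is welcome, but unless the author agrees otherwise this notice must be retained and a link to the original article must be placed prominently on the page. Questions can be sent by email to wangxu198709@gmail.com.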
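Multipath: this multipathing software is widely used on Linux. It aggregates the multiple paths that lead to one block device into a single multipath device. It serves two main purposes:
Redundancy: in Active/Passive mode only half of the paths carry IO. If anything on the IO path fails (a switch, a cable, the back-end storage, and so on), IO switches automatically to the standby paths, and the applications above barely notice.
Performance: in Active/Active mode all paths can carry IO (for example in round-robin mode), which can improve IO throughput or latency.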
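The difference between the two modes can be illustrated with a toy model. This is only a sketch: the `PathGroup` class and its `pick` method are hypothetical names made up for illustration, it is not how dm-multipath is implemented, and for simplicity the Active/Passive case uses a single active path rather than half of a larger set.

```python
import itertools


class PathGroup:
    """Toy model of a multipath device (hypothetical, for illustration only)."""

    def __init__(self, paths, active_active):
        self.paths = list(paths)
        self.failed = set()          # paths currently marked as failed
        self.active_active = active_active
        self._counter = itertools.count()

    def pick(self):
        """Return the path that the next IO would be sent down."""
        healthy = [p for p in self.paths if p not in self.failed]
        if not healthy:
            raise IOError("all paths are down")
        if self.active_active:
            # Active/Active: round-robin across every healthy path.
            return healthy[next(self._counter) % len(healthy)]
        # Active/Passive: stay on the first healthy path and only move
        # to a standby path once the active one has failed.
        return healthy[0]


if __name__ == "__main__":
    passive = PathGroup(["sda", "sdb"], active_active=False)
    print([passive.pick() for _ in range(3)])
    passive.failed.add("sda")        # simulate a link failure
    print(passive.pick())

    active = PathGroup(["sda", "sdb"], active_active=True)
    print([active.pick() for _ in range(4)])
```

In the Active/Passive case the caller keeps getting the same path until it fails, which is the "barely noticed" failover described above; in the Active/Active case consecutive IOs alternate between paths.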
Multipath itself is not the focus of this article; for details see: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/dm_multipath/setup_overview
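Installation: on Debian/Ubuntu multipath can be installed with sudo apt-get install multipath-tools, and on RedHat/CentOS with sudo yum install device-mapper-multipath.
multipath.conf: multipath ships default configurations for mainstream storage arrays and supports many array-specific features such as ALUA. Users can also create /etc/multipath.conf by hand after installation.
The following is the reference configuration for VNX/Unity (from the VNX Cinder driver documentation):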
    blacklist {
        # Skip the files under /dev that are definitely not FC/iSCSI devices
        # Different system may need different customization
        devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
        devnode "^hd[a-z][0-9]*"
        devnode "^cciss!c[0-9]d[0-9]*[p[0-9]*]"

        # Skip LUNZ device from VNX
        device {
            vendor "DGC"
            product "LUNZ"
        }
    }

    defaults {
        user_friendly_names no
        flush_on_last_del yes
    }

    devices {
        # Device attributed for EMC CLARiiON and VNX series ALUA
        device {
            vendor "DGC"
            product ".*"
            product_blacklist "LUNZ"
            path_grouping_policy group_by_prio
            path_selector "round-robin 0"
            path_checker emc_clariion
            features "1 queue_if_no_path"
            hardware_handler "1 alua"
            prio alua
            failback immediate
        }
    }
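In OpenStack, multipath can be used on Nova and Cinder nodes to provide highly available access to the back-end storage. Long ago this code lived separately in the Nova and Cinder projects; to ease maintenance it was eventually split out into its own project: os-brick.
The two most important interfaces in os-brick are connect_volume, which connects a LUN or disk on the storage to the host, and disconnect_volume, which disconnects a LUN on the storage from the host.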
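When the multipath software on the host finds that a host path is no longer accessible, it reports that path as faulty.
For a description of all the states, see: https://en.wikipedia.org/wiki/Linux_DM_Multipath
The os-brick code I picked is an early version that is prone to producing faulty devices: https://github.com/openstack/os-brick/blob/liberty-eol/os_brick/initiator/connector.py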
    @synchronized('connect_volume')
    def connect_volume(self, connection_properties):
        """Attach the volume to instance_name.
        connection_properties for iSCSI must include:
        target_portal(s) - ip and optional port
        target_iqn(s) - iSCSI Qualified Name
        target_lun(s) - LUN id of the volume
        Note that plural keys may be used when use_multipath=True
        """

        device_info = {'type': 'block'}

        if self.use_multipath:
            # Multipath installed, discovering other targets if available
            try:
                ips_iqns = self._discover_iscsi_portals(connection_properties)
            except Exception:
                raise exception.TargetPortalNotFound(
                    target_portal=connection_properties['target_portal'])

            if not connection_properties.get('target_iqns'):
                # There are two types of iSCSI multipath devices. One which
                # shares the same iqn between multiple portals, and the other
                # which use different iqns on different portals.
                # Try to identify the type by checking the iscsiadm output
                # if the iqn is used by multiple portals. If it is, it's
                # the former, so use the supplied iqn. Otherwise, it's the
                # latter, so try the ip,iqn combinations to find the targets
                # which constitutes the multipath device.
                main_iqn = connection_properties['target_iqn']
                all_portals = set([ip for ip, iqn in ips_iqns])
                match_portals = set([ip for ip, iqn in ips_iqns
                                     if iqn == main_iqn])
                if len(all_portals) == len(match_portals):
                    ips_iqns = zip(all_portals, [main_iqn] * len(all_portals))

            for ip, iqn in ips_iqns:
                props = copy.deepcopy(connection_properties)
                props['target_portal'] = ip
                props['target_iqn'] = iqn
                self._connect_to_iscsi_portal(props)

            self._rescan_iscsi()
            host_devices = self._get_device_path(connection_properties)
        else:
            target_props = connection_properties
            for props in self._iterate_all_targets(connection_properties):
                if self._connect_to_iscsi_portal(props):
                    target_props = props
                    break
                else:
                    LOG.warning(_LW(
                        'Failed to connect to iSCSI portal %(portal)s.'),
                        {'portal': props['target_portal']})

            host_devices = self._get_device_path(target_props)

        # The /dev/disk/by-path/... node is not always present immediately
        # TODO(justinsb): This retry-with-delay is a pattern, move to utils?
        tries = 0
        # Loop until at least 1 path becomes available
        while all(map(lambda x: not os.path.exists(x), host_devices)):
            if tries >= self.device_scan_attempts:
                raise exception.VolumeDeviceNotFound(device=host_devices)

            LOG.warning(_LW("ISCSI volume not yet found at: %(host_devices)s. "
                            "Will rescan & retry. Try number: %(tries)s."),
                        {'host_devices': host_devices,
                         'tries': tries})

            # The rescan isn't documented as being necessary(?), but it helps
            if self.use_multipath:
                self._rescan_iscsi()
            else:
                if (tries):
                    host_devices = self._get_device_path(target_props)
                self._run_iscsiadm(target_props, ("--rescan",))

            tries = tries + 1
            if all(map(lambda x: not os.path.exists(x), host_devices)):
                time.sleep(tries ** 2)
            else:
                break

        if tries != 0:
            LOG.debug("Found iSCSI node %(host_devices)s "
                      "(after %(tries)s rescans)",
                      {'host_devices': host_devices, 'tries': tries})

        # Choose an accessible host device
        host_device = next(dev for dev in host_devices if os.path.exists(dev))

        if self.use_multipath:
            # We use the multipath device instead of the single path device
            self._rescan_multipath()
            multipath_device = self._get_multipath_device_name(host_device)
            if multipath_device is not None:
                host_device = multipath_device
                LOG.debug("Unable to find multipath device name for "
                          "volume. Only using path %(device)s "
                          "for volume.", {'device': host_device})

        device_info['path'] = host_device
        return device_info
The important logic, which discovers the block devices on the host, is marked in red in the original post.
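The rescan-and-wait loop in connect_volume is easier to follow when isolated. The following is a minimal sketch of the same pattern, not the os-brick code: the function name `wait_for_any_path` and its parameters are hypothetical, and `rescan`, `exists` and `sleep` are injected so the behaviour can be tried without real iSCSI devices.

```python
import os
import time


def wait_for_any_path(paths, attempts, rescan,
                      exists=os.path.exists, sleep=time.sleep):
    """Wait until at least one of `paths` exists, rescanning in between.

    Returns (first_existing_path, number_of_rescans). Raises LookupError
    once `attempts` rescans have been made without any path appearing.
    """
    tries = 0
    while True:
        # Choose the first accessible device, if there is one.
        for path in paths:
            if exists(path):
                return path, tries
        if tries >= attempts:
            raise LookupError("no device found at %s" % (paths,))
        rescan()                      # e.g. an iSCSI session rescan
        tries += 1
        # Same backoff as the excerpt: sleep tries ** 2 seconds, but only
        # if the rescan did not make a path appear.
        if not any(exists(p) for p in paths):
            sleep(tries ** 2)


if __name__ == "__main__":
    scans = {"count": 0}

    def fake_rescan():
        scans["count"] += 1

    def fake_exists(path):
        # Pretend /dev/b shows up after the second rescan.
        return path == "/dev/b" and scans["count"] >= 2

    print(wait_for_any_path(["/dev/a", "/dev/b"], 3, fake_rescan,
                            fake_exists, sleep=lambda s: None))
```

Every rescan in this loop asks the kernel to look for devices again, which is worth keeping in mind when reading the concurrency discussion later in the article.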
    @synchronized('connect_volume')
    def disconnect_volume(self, connection_properties, device_info):
        """Detach the volume from instance_name.
        connection_properties for iSCSI must include:
        target_portal(s) - IP and optional port
        target_iqn(s) - iSCSI Qualified Name
        target_lun(s) - LUN id of the volume
        """
        if self.use_multipath:
            self._rescan_multipath()
            host_device = multipath_device = None
            host_devices = self._get_device_path(connection_properties)
            # Choose an accessible host device
            for dev in host_devices:
                if os.path.exists(dev):
                    host_device = dev
                    multipath_device = self._get_multipath_device_name(dev)
                    if multipath_device:
                        break
            if not host_device:
                LOG.error(_LE("No accessible volume device: %(host_devices)s"),
                          {'host_devices': host_devices})
                raise exception.VolumeDeviceNotFound(device=host_devices)

            if multipath_device:
                device_realpath = os.path.realpath(host_device)
                self._linuxscsi.remove_multipath_device(device_realpath)
                return self._disconnect_volume_multipath_iscsi(
                    connection_properties, multipath_device)

        # When multiple portals/iqns/luns are specified, we need to remove
        # unused devices created by logging into other LUNs' session.
        for props in self._iterate_all_targets(connection_properties):
            self._disconnect_volume_iscsi(props)
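The code marked in red in the original post removes the host paths of the LUN from the kernel and from the multipath mapper.
Note that both interfaces take the same lock, named connect_volume (it is really just a Linux file lock implemented with flock).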
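To make the shared lock concrete, here is a minimal sketch of a file-lock decorator built on `fcntl.flock`. It is an illustration only: os-brick takes its `synchronized` decorator from oslo.concurrency, and this sketch is not that implementation. The lock file name and the `lock_dir` parameter are assumptions, and `fcntl` is available on Unix-like systems only.

```python
import fcntl
import functools
import os
import tempfile


def synchronized(name, lock_dir=None):
    """Serialize every function decorated with the same `name`."""
    lock_dir = lock_dir or tempfile.gettempdir()
    # Hypothetical lock file location, chosen for this sketch.
    lock_path = os.path.join(lock_dir, "sketch-%s.lock" % name)

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with open(lock_path, "w") as lock_file:
                # Blocks until no other holder has the exclusive lock.
                fcntl.flock(lock_file, fcntl.LOCK_EX)
                try:
                    return func(*args, **kwargs)
                finally:
                    fcntl.flock(lock_file, fcntl.LOCK_UN)
        return wrapper
    return decorator


@synchronized("connect_volume")
def connect_volume(volume):
    return "connected %s" % volume


@synchronized("connect_volume")   # same name, so same lock as above
def disconnect_volume(volume):
    return "disconnected %s" % volume


if __name__ == "__main__":
    print(connect_volume("vol-1"))
    print(disconnect_volume("vol-1"))
```

Because both functions use the same lock name, a connect and a disconnect never run at the same moment, even for unrelated volumes. The lock only serializes the calls; it does not decide the order in which queued connects and disconnects run.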
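To make it easier to describe how a faulty device comes about, I drew the following diagram of how the two interfaces relate.
Without concurrency, the flow above behaves correctly: devices on the host are connected and cleaned up normally.
However, the logic has an implementation problem: under high concurrency it produces faulty devices. Consider the following order of execution: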
    $ sudo multipath -ll
    3600601601290380036a00936cf13e711 dm-30 DGC,VRAID
    size=2.0G features='1 retain_attached_hw_handler' hwhandler='1 alua' wp=rw
    |-+- policy='round-robin 0' prio=0 status=active
    | `- 11:0:0:151 sdef 128:112 failed faulty running
    `-+- policy='round-robin 0' prio=0 status=enabled
      `- 12:0:0:151 sdeg 128:128 failed faulty running
    3600601601bd032007c097518e96ae411 dm-2 DGC,VRAID
    size=1.0G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
    |-+- policy='round-robin 0' prio=0 status=active
      `- #:#:#:# - #:# active faulty running
In general, a multipath device whose output shows #:#:#:# can be removed directly with sudo multipath -f 3600601601bd032007c097518e96ae411.
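Finding such devices by eye gets tedious on a busy host. The sketch below parses `multipath -ll` output and lists the maps that contain a `#:#:#:#` path, so the matching `multipath -f` commands can be reviewed before running them. It assumes `user_friendly_names no`, so that the first token of a map header is the WWID; the function name `find_stale_maps` and the regular expression are my own, not part of multipath-tools.

```python
import re

# Map header line, e.g. "3600601601bd0... dm-2 DGC,VRAID".
# Assumes user_friendly_names is "no", so the first token is the WWID.
_MAP_HEADER = re.compile(r"^(?P<wwid>\S+)\s+dm-\d+\s")
_STALE_PATH = "#:#:#:#"


def find_stale_maps(multipath_ll_output):
    """Return the WWIDs of maps that list a path with no H:C:T:L address."""
    stale = []
    current = None
    for line in multipath_ll_output.splitlines():
        header = _MAP_HEADER.match(line.strip())
        if header:
            current = header.group("wwid")
        elif current and _STALE_PATH in line and current not in stale:
            stale.append(current)
    return stale


SAMPLE = """\
3600601601290380036a00936cf13e711 dm-30 DGC,VRAID
size=2.0G features='1 retain_attached_hw_handler' hwhandler='1 alua' wp=rw
|-+- policy='round-robin 0' prio=0 status=active
| `- 11:0:0:151 sdef 128:112 failed faulty running
`-+- policy='round-robin 0' prio=0 status=enabled
  `- 12:0:0:151 sdeg 128:128 failed faulty running
3600601601bd032007c097518e96ae411 dm-2 DGC,VRAID
size=1.0G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='round-robin 0' prio=0 status=active
  `- #:#:#:# - #:# active faulty running
"""

if __name__ == "__main__":
    for wwid in find_stale_maps(SAMPLE):
        # Only print the command; review it before running it as root.
        print("sudo multipath -f %s" % wwid)
```

The sketch deliberately prints the commands instead of executing them, and it ignores the first map in the sample: its paths are faulty but still have SCSI addresses, so it does not fall under the `#:#:#:#` rule described above.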
That concludes part one, on how faulty devices arise. When I get the chance, I will cover how to avoid faulty devices as far as possible in os-brick.
References:
RedHat's official introduction to multipath: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/dm_multipath/mpio_description
EMC VNX driver doc: https://docs.openstack.org/cinder/queens/configuration/block-storage/drivers/dell-emc-vnx-driver.html
A block device connection tool implemented in Go: https://github.com/peter-wangxu/goock
iSCSI Faulty Device Cleanup Script for VNX: https://github.com/emc-openstack/vnx-faulty-device-cleanup