Causes and Fixes of Multipath Faulty Devices in OpenStack (Part 2)

Date: 2019-12-06

Tags: openstack, multipath, faulty device, causes, fixes

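Copyright: this article is jointly owned by the author and 博客園 (cnblogs). Reposting is welcome, but unless the author agrees otherwise this notice must be kept and a link to the original must be shown prominently on the article page. For questions, email wangxu198709@gmail.com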

Introduction

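In the previous article, Causes and Fixes of Multipath Faulty Devices in OpenStack (Part 1), I explained in detail how faulty devices come about. This article focuses on how os-brick avoids creating faulty devices under concurrency, and on the specific implementation that makes this possible.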

Before getting into the implementation, it is worth covering a few details of how SCSI block devices are addressed on Linux.

The Linux kernel locates a specific LUN through the following hierarchy:

SCSI adapter number [host]
channel number [bus]
id number [target]
lun [lun]

For more details, see [SCSI Addressing]. In other words, a LUN can be identified as [host-bus(channel)-target-lun].
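To make the four-part address concrete, here is a minimal sketch that parses a colon-separated host:channel:target:lun string. The names ScsiAddress and parse_hctl are my own for illustration and are not part of os-brick or the kernel.

```python
from typing import NamedTuple


class ScsiAddress(NamedTuple):
    """Hypothetical container for a host:channel:target:lun address."""
    host: int
    channel: int
    target: int
    lun: int


def parse_hctl(text: str) -> ScsiAddress:
    """Parse a string such as '3:0:0:1' (optionally in brackets)."""
    parts = text.strip().strip('[]').split(':')
    if len(parts) != 4:
        raise ValueError('expected host:channel:target:lun, got %r' % text)
    return ScsiAddress(*(int(part) for part in parts))
```

For example, parse_hctl('3:0:0:1') yields host 3, channel 0, target 0 and LUN 1.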

Each time Linux connects to an iSCSI target, the kernel creates a matching directory tree under /sys/class/iscsi_host/host*/device/session to represent the SCSI device.

$ ls -l /sys/class/iscsi_host/host3/device/session1/
total 0
drwxr-xr-x 4 root root    0 Apr 21 21:54 connection1:0
drwxr-xr-x 3 root root    0 Apr 21 21:54 iscsi_session
drwxr-xr-x 2 root root    0 Apr 21 21:55 power
drwxr-xr-x 5 root root    0 Apr 21 21:54 target3:0:0
-rw-r--r-- 1 root root 4096 Apr 21 21:54 uevent

The 3:0:0 above is the host:channel:target of an iSCSI target.

BTW: if you do not see a directory tree like the one above, you need to connect to an iSCSI target first. This is the target I am connected to:

$ sudo iscsiadm -m session
tcp: [1] 172.17.0.2:3260,1 tgt1 (non-flash)
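The number in square brackets is the session id. As a rough sketch, assuming the output always has the shape shown above, it could be pulled out like this. SESSION_RE and parse_sessions are hypothetical helpers of mine, not something iscsiadm or os-brick provides.

```python
import re

# Matches lines such as: tcp: [1] 172.17.0.2:3260,1 tgt1 (non-flash)
SESSION_RE = re.compile(
    r'^(?P<transport>\S+): \[(?P<sid>\d+)\] '
    r'(?P<portal>\S+),(?P<tpgt>-?\d+) (?P<target>\S+)')


def parse_sessions(output):
    """Return one dict per session line; unrecognized lines are skipped."""
    sessions = []
    for line in output.splitlines():
        match = SESSION_RE.match(line.strip())
        if match:
            sessions.append(match.groupdict())
    return sessions
```

All captured values are returned as strings, so the session id for the line above comes back as '1'.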

Solution

As described in part 1, os-brick ran multipath -r and iscsiadm -m session -R when connecting (connect_volume) and disconnecting (disconnect_volume) a volume, respectively.

Those commands cause every LUN on the bus of every iSCSI target to be scanned.

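os-brick tackles the root cause directly: based on the target and LUN the user wants to connect to, it narrows the scan to that specific LUN on that specific target.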

The process is as follows:

1. First, find the h-c-t-l that corresponds to the session id and LUN id supplied by the user (code LINK):

    # Excerpt: a method of an os-brick class. It relies on the glob and os
    # modules and on a module-level LOG logger.
    def get_hctl(self, session, lun):
        """Given an iSCSI session return the host, channel, target, and lun."""
        glob_str = '/sys/class/iscsi_host/host*/device/session' + session
        paths = glob.glob(glob_str + '/target*')
        if paths:
            # Directory name looks like 'target3:0:0'
            __, channel, target = os.path.split(paths[0])[1].split(':')
        # Check if we can get the host
        else:
            target = channel = '-'
            paths = glob.glob(glob_str)

        if not paths:
            LOG.debug('No hctl found on session %s with lun %s', session, lun)
            return None

        # Extract the host number from the path
        # (26 == len('/sys/class/iscsi_host/host'))
        host = paths[0][26:paths[0].index('/', 26)]
        res = (host, channel, target, lun)
        LOG.debug('HCTL %s found on session %s with lun %s', res, session, lun)
        return res

The session argument above is the [1] in tcp: [1] 172.17.0.2:3260,1 tgt1 (non-flash); lun is the ID of the LUN to connect to, normally supplied by the Cinder driver.

For my session, LUN=1 maps to this hctl: HCTL ('3', '0', '0', 1) found on session 1 with lun 1
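To try the lookup without an os-brick install, here is a standalone sketch of the same idea. It is my own rewrite, not the os-brick code: find_hctl and its sysfs_root parameter are hypothetical, added so the function can be pointed at a fake directory tree, and it derives the host number from the directory name instead of a fixed string offset.

```python
import glob
import os


def find_hctl(session, lun, sysfs_root='/sys'):
    """Return (host, channel, target, lun) for an iSCSI session, or None."""
    session_glob = os.path.join(
        sysfs_root, 'class', 'iscsi_host', 'host*', 'device',
        'session%s' % session)
    for session_dir in glob.glob(session_glob):
        # .../iscsi_host/host3/device/session1 -> 'host3' -> '3'
        host_dir = os.path.basename(
            os.path.dirname(os.path.dirname(session_dir)))
        host = host_dir[len('host'):]
        # '-' means "unknown", which the scan step treats as a wildcard
        channel = target = '-'
        targets = glob.glob(os.path.join(session_dir, 'target*'))
        if targets:
            # 'target3:0:0' -> ['target3', '0', '0']
            _, channel, target = os.path.basename(targets[0]).split(':')
        return (host, channel, target, lun)
    return None
```

On a machine with the session shown earlier, find_hctl('1', 1) should give the same tuple as the log line above. Host, channel and target come back as strings, while lun is returned exactly as it was passed in.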

2. Use the hctl from above when scanning (code link):

    def scan_iscsi(self, host, channel='-', target='-', lun='-'):
        """Send an iSCSI scan request given the host and optionally the ctl."""
        LOG.debug('Scanning host %(host)s c: %(channel)s, '
                  't: %(target)s, l: %(lun)s)',
                  {'host': host, 'channel': channel,
                   'target': target, 'lun': lun})
        # '-' is a wildcard, so the defaults scan every channel, target
        # and LUN on the host
        self.echo_scsi_command('/sys/class/scsi_host/host%s/scan' % host,
                               '%(c)s %(t)s %(l)s' % {'c': channel,
                                                      't': target,
                                                      'l': lun})
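The scan request itself is just a short string written to the host's scan file, where "-" acts as a wildcard: "- - -" asks the kernel to scan every channel, target and LUN on that host, while "0 0 1" limits the scan to one LUN. The sketch below shows that contrast. It assumes a plain file write instead of os-brick's echo_scsi_command helper, and build_scan_request, request_scan and sysfs_root are hypothetical names of mine.

```python
import os


def build_scan_request(host, channel='-', target='-', lun='-',
                       sysfs_root='/sys'):
    """Return (path, payload) for a SCSI scan request on one host."""
    path = os.path.join(sysfs_root, 'class', 'scsi_host',
                        'host%s' % host, 'scan')
    payload = '%s %s %s' % (channel, target, lun)
    return path, payload


def request_scan(host, channel='-', target='-', lun='-', sysfs_root='/sys'):
    """Write the scan request; on a real system this needs root."""
    path, payload = build_scan_request(host, channel, target, lun,
                                       sysfs_root)
    with open(path, 'w') as scan_file:
        scan_file.write(payload)
    return path, payload
```

With the hctl found earlier, build_scan_request('3', '0', '0', 1) targets /sys/class/scsi_host/host3/scan with the payload "0 0 1", while build_scan_request('3') produces the wildcard payload "- - -".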