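Understanding SPDK in Depth, Part 6: SPDK Troubleshooting (Part B)

This post records fixes for common SPDK errors, so that the same detours need not be taken twice.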

Reads/writes not aligned to 512 B

Symptom:

nvme_qpair.c: 137:nvme_io_qpair_print_command: *NOTICE*: WRITE sqid:1 cid:191 nsid:1 lba:0 len:65536

nvme_qpair.c: 306:nvme_qpair_print_completion: *NOTICE*: INVALID FIELD (00/02) sqid:1 cid:191 cdw0:0 sqhd:0002 p:1 m:0 dnr:1

Solution: check the application's run log:

TRACE: 09-12 10:45:02:   * 0 common/spdknvme_io.c:296] OP: Write, Offset:0, Size: 13

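The SPDK NVMe interface requires reads and writes to be 512 B aligned, and the 13-byte write in the trace above is not. After the I/O size was changed to a multiple of 512 B, the error disappeared.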
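The alignment rule can be checked before an I/O is submitted. The following is a minimal sketch, not code from the original application: the helper names and the fixed SECTOR_SIZE of 512 are assumptions made for illustration. A real program would more likely ask the namespace for its sector size (for example with spdk_nvme_ns_get_sector_size()) than hard-code it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed sector size for this sketch; query the namespace in real code. */
#define SECTOR_SIZE 512u

/* True when the byte offset and byte length both fall on sector boundaries. */
static bool io_is_sector_aligned(uint64_t offset, uint64_t size)
{
    return size != 0 && offset % SECTOR_SIZE == 0 && size % SECTOR_SIZE == 0;
}

/* Round a byte length up to the next multiple of the sector size. */
static uint64_t round_up_to_sector(uint64_t size)
{
    return (size + SECTOR_SIZE - 1) / SECTOR_SIZE * SECTOR_SIZE;
}

/* Number of sectors covered by an already aligned byte length. */
static uint32_t bytes_to_lba_count(uint64_t size)
{
    assert(size % SECTOR_SIZE == 0);
    return (uint32_t)(size / SECTOR_SIZE);
}
```

With these helpers, the write from the trace (Offset 0, Size 13) fails io_is_sector_aligned(), and round_up_to_sector(13) gives the 512-byte length that would have to be submitted instead. The padding bytes must come from a buffer that is at least that large.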

I/O buffers not allocated from hugepage memory

Symptom:

starting write I/O failed, push back, reback to previous status
starting write I/O failed, push back, reback to previous status

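Solution:
Memory used for SPDK reads and writes must be allocated from hugepages through the DPDK EAL (for example with spdk_dma_zmalloc()); the EAL library maps this memory into user space. Ordinary memory cannot be used for DMA, so the NVMe hardware queues cannot access it directly. Check, therefore, that every buffer passed to the read/write interface is allocated from hugepages. A check here showed that one buffer was not; after the allocation was fixed, the program ran normally.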

Hugepage initialization failure

Symptom:

Starting SPDK v19.04-pre / DPDK 18.08.0 initialization...
[ DPDK EAL parameters: append_demo -c 0x8 --base-virtaddr=0x200000000000 --file-prefix=spdk0 --proc-type=auto ]
EAL: Detected 32 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Auto-detected process type: PRIMARY
EAL: Multi-process socket /var/run/dpdk/spdk0/mp_socket
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Cannot allocate memzone list
EAL: FATAL: Cannot init memzone

EAL: Cannot init memzone

 EAL: Cannot init memzone

Failed to initialize DPDK
Unable to initialize Spdk env

Analysis and solution:
Check whether any hugepages are actually reserved:

[root@szw scripts]# cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
0
[root@szw scripts]# cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
0

Indeed, no hugepages are reserved. Reserve them again:

cd  spdk/script ; ./all_setup.sh config
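The same check can be done from inside the program before the SPDK environment is initialized, so that a missing reservation produces a clear message instead of the EAL memzone error. This is a minimal sketch under stated assumptions: read_counter() and hugepages_reserved() are hypothetical helper names, and only the two sysfs paths shown above are consulted.

```c
#include <assert.h>
#include <stdio.h>

/* Read the integer stored in a sysfs-style counter file.
 * Returns -1 if the file cannot be opened or parsed. */
static long read_counter(const char *path)
{
    FILE *fp;
    long value = -1;

    assert(path != NULL);
    fp = fopen(path, "r");
    if (fp == NULL) {
        return -1;
    }
    if (fscanf(fp, "%ld", &value) != 1) {
        value = -1;
    }
    fclose(fp);
    return value;
}

/* Returns 1 if at least one 1 GiB or 2 MiB hugepage is reserved, else 0. */
static int hugepages_reserved(void)
{
    long n1g = read_counter(
        "/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages");
    long n2m = read_counter(
        "/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages");

    return (n1g > 0 || n2m > 0) ? 1 : 0;
}
```

Calling hugepages_reserved() at startup and exiting early when it returns 0 would point directly at the cause seen here. Note that nr_hugepages counts reserved pages, not free ones, so a nonzero value does not guarantee that enough pages remain for the process.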

Controller in failed state

Starting SPDK v19.04-pre / DPDK 18.08.0 initialization...
[ DPDK EAL parameters: append_demo -c 0x8 --base-virtaddr=0x200000000000 --file-prefix=spdk0 --proc-type=auto ]
EAL: Detected 32 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Auto-detected process type: PRIMARY
EAL: Multi-process socket /var/run/dpdk/spdk0/mp_socket
TRACE: 08-13 19:24:01:   * 0 baidu/bce/cds/common/spdk_nvme_io.cpp:534] Initializing NVMe Controllers

TRACE: 08-13 19:24:01:   * 0 baidu/bce/cds/common/spdk_nvme_io.cpp:182] Attaching to 0000:b0:00.0

nvme_ctrlr.c:2170:nvme_ctrlr_process_init: *ERROR*: Initialization timed out in state 2
nvme_ctrlr.c: 496:nvme_ctrlr_fail: *ERROR*: ctrlr 0000:b0:00.0 in failed state.
nvme.c: 423:nvme_init_controllers: *ERROR*: Failed to initialize SSD: 0000:b0:00.0
nvme_ctrlr.c: 553:nvme_ctrlr_shutdown: *ERROR*: did not shutdown within 10000 milliseconds

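Solution: the disk in SPDK mode is already bound by another process, so the current process cannot attach to it. Kill all processes that are using the SPDK-mode disk, then restart the program.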
