This post records fixes for common SPDK errors, so that the same detours need not be repeated.
Symptom:
nvme_qpair.c: 137:nvme_io_qpair_print_command: *NOTICE*: WRITE sqid:1 cid:191 nsid:1 lba:0 len:65536
nvme_qpair.c: 306:nvme_qpair_print_completion: *NOTICE*: INVALID FIELD (00/02) sqid:1 cid:191 cdw0:0 sqhd:0002 p:1 m:0 dnr:1
Solution: check the application's run log:
TRACE: 09-12 10:45:02: * 0 common/spdknvme_io.c:296] OP: Write, Offset:0, Size: 13
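The SPDK NVMe read/write interface requires I/O to be 512-byte aligned, and the log above shows a write of Size: 13. After changing the size to a multiple of 512 bytes, the error disappeared.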
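The alignment rule can be checked before an I/O is submitted. The following is a minimal sketch, not SPDK code: the helper names `is_sector_aligned` and `round_up_to_sector` are hypothetical, and the 512-byte sector size is an assumption taken from this case. In a real SPDK program the sector size would normally be queried from the namespace (for example with `spdk_nvme_ns_get_sector_size`) rather than hard-coded.

```python
SECTOR_SIZE = 512  # assumed sector size from this case; query the namespace for the real value


def is_sector_aligned(offset: int, size: int, sector_size: int = SECTOR_SIZE) -> bool:
    """Return True when offset and size are both multiples of sector_size and size is positive."""
    return (
        size > 0
        and offset >= 0
        and offset % sector_size == 0
        and size % sector_size == 0
    )


def round_up_to_sector(size: int, sector_size: int = SECTOR_SIZE) -> int:
    """Round size up to the next multiple of sector_size (for padding a short buffer)."""
    return (size + sector_size - 1) // sector_size * sector_size
```

With the values from the log, `is_sector_aligned(0, 13)` is false, and `round_up_to_sector(13)` gives the padded size that would pass the check.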
Symptom:
starting write I/O failed, push back, reback to previous status
starting write I/O failed, push back, reback to previous status
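Solution:
Memory used for SPDK reads and writes must be allocated from hugepages through the DPDK EAL, which maps that memory into user space. Ordinary memory cannot be used for DMA, so the NVMe hardware queues cannot access it directly. Therefore, check that every buffer passed to the read/write interfaces is allocated from hugepages. On inspection, that was indeed the problem here; after the fix, the program ran normally.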
Symptom:
Starting SPDK v19.04-pre / DPDK 18.08.0 initialization...
[ DPDK EAL parameters: append_demo -c 0x8 --base-virtaddr=0x200000000000 --file-prefix=spdk0 --proc-type=auto ]
EAL: Detected 32 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Auto-detected process type: PRIMARY
EAL: Multi-process socket /var/run/dpdk/spdk0/mp_socket
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Cannot allocate memzone list
EAL: FATAL: Cannot init memzone
EAL: Cannot init memzone
EAL: Cannot init memzone
Failed to initialize DPDK
Unable to initialize Spdk env
Analysis and solution:
Check whether any hugepages are actually configured:
[root@szw scripts]# cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
0
[root@szw scripts]# cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
0
Indeed, there are no hugepages left. Allocate them again:
cd spdk/script ; ./all_setup.sh config
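The hugepage check can also be scripted so that it runs before the application starts. This is a minimal sketch under stated assumptions: the function names `hugepage_counts` and `has_free_hugepages` are hypothetical, and the code only reads the standard Linux sysfs files `nr_hugepages` and `free_hugepages` under `/sys/kernel/mm/hugepages`. It does not allocate anything.

```python
from pathlib import Path


def hugepage_counts(base: str = "/sys/kernel/mm/hugepages") -> dict:
    """Map each size directory (e.g. 'hugepages-2048kB') to (nr_hugepages, free_hugepages)."""
    counts = {}
    root = Path(base)
    if not root.is_dir():
        return counts  # no hugepage support exposed at this path
    for entry in sorted(root.glob("hugepages-*")):
        try:
            total = int((entry / "nr_hugepages").read_text().strip())
            free = int((entry / "free_hugepages").read_text().strip())
        except (OSError, ValueError):
            continue  # skip entries that cannot be read or parsed
        counts[entry.name] = (total, free)
    return counts


def has_free_hugepages(base: str = "/sys/kernel/mm/hugepages") -> bool:
    """Return True when at least one hugepage size reports free pages."""
    return any(free > 0 for _total, free in hugepage_counts(base).values())
```

If `has_free_hugepages()` returns false, the EAL initialization shown above can be expected to fail in the same way, so the setup script should be run first.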
Symptom:
Starting SPDK v19.04-pre / DPDK 18.08.0 initialization...
[ DPDK EAL parameters: append_demo -c 0x8 --base-virtaddr=0x200000000000 --file-prefix=spdk0 --proc-type=auto ]
EAL: Detected 32 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Auto-detected process type: PRIMARY
EAL: Multi-process socket /var/run/dpdk/spdk0/mp_socket
TRACE: 08-13 19:24:01: * 0 baidu/bce/cds/common/spdk_nvme_io.cpp:534] Initializing NVMe Controllers
TRACE: 08-13 19:24:01: * 0 baidu/bce/cds/common/spdk_nvme_io.cpp:182] Attaching to 0000:b0:00.0
nvme_ctrlr.c:2170:nvme_ctrlr_process_init: *ERROR*: Initialization timed out in state 2
nvme_ctrlr.c: 496:nvme_ctrlr_fail: *ERROR*: ctrlr 0000:b0:00.0 in failed state.
nvme.c: 423:nvme_init_controllers: *ERROR*: Failed to initialize SSD: 0000:b0:00.0
nvme_ctrlr.c: 553:nvme_ctrlr_shutdown: *ERROR*: did not shutdown within 10000 milliseconds
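Solution: the disk in SPDK mode is already held by another process, so the current process cannot attach to it. Kill all processes that are using the SPDK-mode disk, then restart the program.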
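Finding the processes to kill can be automated. The sketch below rests on an assumption not stated in the original: that the disk is bound to a userspace I/O driver (`uio_pci_generic` or `vfio-pci`), so a process using it holds an open file descriptor on `/dev/uio*` or `/dev/vfio/*`. The function name `pids_holding_userspace_io` is hypothetical. It scans `/proc/<pid>/fd` and usually needs root to see other users' processes.

```python
import os


def pids_holding_userspace_io(proc_root: str = "/proc",
                              prefixes=("/dev/uio", "/dev/vfio/")) -> dict:
    """Map each PID to the set of matching device paths it has open."""
    prefixes = tuple(prefixes)  # str.startswith needs a tuple of prefixes
    holders = {}
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # only numeric entries are process directories
        fd_dir = os.path.join(proc_root, name, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited or permission denied
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor closed between listdir and readlink
            if target.startswith(prefixes):
                holders.setdefault(int(name), set()).add(target)
    return holders
```

The returned PIDs are only candidates: confirm what each process is before killing it, since several SPDK applications on the same host may hold different devices.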