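A batch-job server could no longer read from or write to one of its disks; every access failed with "Input/output error". The server is needed every day. After asking around, I learned that a disk in the server had failed first, and that this error started appearing after the disk was replaced (the RAID had already resynchronized normally).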
When something goes wrong, check the logs first: collect them and analyze them. The relevant kernel log entries are as follows:
[12922471.544897] smartpqi 0000:5e:00.0: reset of scsi 14:1:0:3: SUCCESS
[12922471.545034] sd 14:1:0:3: [sdd] Medium access timeout failure. Offlining disk!
...
[12922471.546144] blk_update_request: I/O error, dev sdd, sector 2351217920
[12922471.546473] sd 14:1:0:3: rejecting I/O to offline device
[12922471.547836] XFS (sdd1): metadata I/O error: block 0x8bbac400 ("xlog_iodone") error 5 numblks 512
[12922471.547840] XFS (sdd1): xfs_do_force_shutdown(0x2) called from line 1200 of file fs/xfs/xfs_log.c. Return address = 0xffffffffc07a1ea0
[12922471.547866] XFS (sdd1): Log I/O Error Detected. Shutting down filesystem
[12922471.547868] XFS (sdd1): Please umount the filesystem and rectify the problem(s)
[12922471.547870] XFS (sdd1): metadata I/O error: block 0x8bbac600 ("xlog_iodone") error 5 numblks 512
[12922471.547872] XFS (sdd1): xfs_do_force_shutdown(0x2) called from line 1200 of file fs/xfs/xfs_log.c. Return address = 0xffffffffc07a1ea0
[12922471.547891] XFS (sdd1): metadata I/O error: block 0x2bc1a6c0 ("xfs_trans_read_buf_map") error 5 numblks 32
[12922471.547898] XFS (sdd1): xfs_imap_to_bp: xfs_trans_read_buf() returned error -5.
[12922471.548349] XFS (sdd1): metadata I/O error: block 0xc65b63f8 ("xfs_trans_read_buf_map") error 5 numblks 8
[12922471.548390] XFS (sdd1): metadata I/O error: block 0x8bdb5820 ("xfs_trans_read_buf_map") error 5 numblks 32
[12922471.548408] XFS (sdd1): xfs_imap_to_bp: xfs_trans_read_buf() returned error -5.
[12922471.548412] XFS (sdd1): metadata I/O error: block 0x11771540 ("xfs_trans_read_buf_map") error 5 numblks 32
[12922471.548417] XFS (sdd1): xfs_imap_to_bp: xfs_trans_read_buf() returned error -5.
...
[15351852.339037] sd 14:1:0:3: rejecting I/O to offline device
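In a long kernel log, the line that matters most here is the one where the SCSI layer takes the disk offline. As a minimal sketch (the function name `offlined_disks` is my own, not a standard tool), the affected device names can be pulled out like this, assuming the message format shown in the log above:

```shell
# Print the names of disks that the kernel reported as taken offline.
# Reads kernel log text on stdin (for example, piped from dmesg).
offlined_disks() {
    # Matches lines such as:
    #   sd 14:1:0:3: [sdd] Medium access timeout failure. Offlining disk!
    # and prints only the device name between the brackets.
    sed -n 's/.*\[\(sd[a-z]*\)].*Offlining disk.*/\1/p' | sort -u
}
```

Typical use would be `dmesg | offlined_disks`, which for the log above should list only sdd.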
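- The logs show that the disk has been taken offline and that the filesystem on it is in an error state (XFS has shut it down).
- 1. Manually bring the disk back online by setting its state to running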
# echo running > /sys/block/sdd/device/state
- 2. Confirm that the state is now running
cat /sys/block/sdd/device/state
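Steps 1 and 2 can be combined into one small helper that writes the state and then reads it back. This is only a sketch: `set_disk_running` is a hypothetical name, and the path argument is assumed to be the sysfs state file for the affected disk (for example /sys/block/sdd/device/state, which needs root to write).

```shell
# Write "running" to a SCSI device state file, then read it back to confirm.
# $1: path to the state file, e.g. /sys/block/sdd/device/state
set_disk_running() {
    state_file=$1
    if [ ! -w "$state_file" ]; then
        echo "cannot write $state_file" >&2
        return 1
    fi
    echo running > "$state_file" || return 1
    # Read the value back instead of trusting the write.
    state=$(cat "$state_file")
    [ "$state" = "running" ]
}
```

The function returns 0 only if the file reads back as running, so it can be used as `set_disk_running /sys/block/sdd/device/state && echo ok`.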
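- 3. Before repairing the filesystem, make sure it is unmounted (this depends on the situation: if it cannot be unmounted, the only option is a reboot, which is what I did)
- 4. Run the repair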
XFS : Corruption detected. Unmount and run xfs_repair
The official documentation is here: https://access.redhat.com/solutions/1194613
- 5. Once the repair has completed as described there, mount the filesystem again
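Steps 3 to 5 can be sketched as a single sequence. This is an illustration under assumptions, not the exact procedure from the linked article: the helper names `run` and `repair_xfs` are hypothetical, the partition is assumed to be /dev/sdd1 as in the log, and the mount point is whatever your system uses. Setting DRY_RUN=1 only prints the commands, which is a sensible first pass before touching a damaged filesystem.

```shell
# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1: XFS partition (e.g. /dev/sdd1)   $2: its mount point (assumed)
repair_xfs() {
    dev=$1
    mnt=$2
    # xfs_repair needs the filesystem unmounted; unmount only if it is mounted.
    if grep -qs "^$dev " /proc/mounts; then
        run umount "$mnt" || return 1
    fi
    # -n is no-modify mode: report problems without changing anything.
    run xfs_repair -n "$dev"
    # The actual repair.
    run xfs_repair "$dev" || return 1
    # Step 5: mount the filesystem again.
    run mount "$dev" "$mnt"
}
```

A cautious invocation would be `DRY_RUN=1 repair_xfs /dev/sdd1 /your/mountpoint` first, then the same command without DRY_RUN as root. If xfs_repair refuses to proceed because of a dirty log, consult the linked article and the xfs_repair man page before going further.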