[PyTorch] Docker shared memory problem

Date: 2020-08-17

Problem

ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm)

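This error appears when training code runs inside a Docker container on a server and the batch size is set too large: the data loader workers run out of shared memory, because Docker limits the size of the container's shm (/dev/shm, 64 MB by default).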
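To confirm that shared memory is the bottleneck, it can help to check how much of it the container actually has. The following is a minimal sketch, assuming a Linux container where shared memory is mounted at /dev/shm; the helper name `shm_usage_mib` is made up for illustration and is not part of PyTorch.

```python
import os
import shutil


def shm_usage_mib(path="/dev/shm"):
    """Return (total, free) space of the given mount point in MiB."""
    usage = shutil.disk_usage(path)
    mib = 1024 * 1024
    return usage.total / mib, usage.free / mib


# Assumption: shared memory lives at /dev/shm (true for typical Linux containers).
if os.path.isdir("/dev/shm"):
    total, free = shm_usage_mib()
    print(f"/dev/shm total: {total:.0f} MiB, free: {free:.0f} MiB")
else:
    print("/dev/shm not found; this check only applies to Linux containers")
```

If the reported total is small compared with the size of the batches that the worker processes hand back, the bus error above is a likely outcome.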

According to the PyTorch README:

Please note that PyTorch uses shared memory to share data between processes, so if torch multiprocessing is used (e.g. for multithreaded data loaders) the default shared memory segment size that container runs with is not enough, and you should increase shared memory size either with --ipc=host or --shm-size command line options to nvidia-docker run.
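The two options named in the README can be sketched as a small helper that assembles the `docker run` arguments. This is only an illustration: the function `docker_run_command`, the image name `my-training-image`, the script `train.py`, and the size `8g` are hypothetical placeholders, while `--ipc=host` and `--shm-size` are the flags quoted above.

```python
import shlex


def docker_run_command(image, command, shm_size=None, share_host_ipc=False):
    """Build a `docker run` argument list with a larger shared memory setting.

    Either share the host's IPC namespace (--ipc=host) or give the
    container its own, larger /dev/shm (--shm-size=<size>).
    """
    args = ["docker", "run", "--rm"]
    if share_host_ipc:
        args.append("--ipc=host")
    elif shm_size is not None:
        args.append(f"--shm-size={shm_size}")
    args.append(image)
    args.extend(command)
    return args


# Hypothetical image and script names, used only for illustration.
cmd = docker_run_command("my-training-image", ["python", "train.py"], shm_size="8g")
print(shlex.join(cmd))
```

The printed command can be pasted into a shell, or the list can be passed to `subprocess.run`. `--shm-size` keeps the container isolated and only enlarges its own shared memory, whereas `--ipc=host` shares the host's IPC namespace and so removes that isolation.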