Because aufs has not been merged into the mainline kernel, only Ubuntu can use aufs as Docker's storage engine; other systems fall back to LVM thin provisioning through devicemapper. (overlayfs is a union filesystem similar to aufs and may eventually land in the kernel, but it is not there yet; classic LVM snapshots are useful for things like backing up a snapshot, but their performance regresses badly once many snapshots of the same device exist.) To implement thin provisioning, Docker sets up a 100G sparse file at startup (/var/lib/docker/devicemapper/devicemapper/data, with metadata in /var/lib/docker/devicemapper/devicemapper/metadata) and uses it as the devicemapper storage pool; every container is then allocated 10G from this pool by default.
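Both files are sparse, so their apparent size and their actual disk usage differ; assuming the default loopback setup, this can be checked with:
ls -lh /var/lib/docker/devicemapper/devicemapper/    # apparent sizes: 100G data, 2G metadata
du -sh /var/lib/docker/devicemapper/devicemapper/*   # blocks actually allocated so far
losetup -a                                           # the loop devices backing the pool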
For example, when an apache container is created, devicemapper goes through the following steps (the resulting device chain can be observed with dmsetup, as sketched after the list):
- Create a snapshot of the base device.
- Mount it and apply the changes in the fedora image.
- Create a snapshot based on the fedora device.
- Mount it and apply the changes in the apache image.
- Create a snapshot based on the apache device.
- Mount it and use as the root in the new container.
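Each of these layers ends up as a thin device backed by the same pool. A quick way to observe the result (device names here are taken from the examples further down):
dmsetup ls | grep docker                      # the pool plus one thin device per active layer
dmsetup status docker-253:1-138011042-pool    # transaction id and used/total metadata and data blocks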
Managing thin provisioning
A thin pool can be created by hand with the lvm tools:
dd if=/dev/zero of=lvm.img bs=1M count=100   # 100M backing file for the PV
losetup /dev/loop7 lvm.img                   # expose it as a block device
losetup -a                                   # confirm the loop device is attached
pvcreate /dev/loop7
vgcreate lvm_pool /dev/loop7
# create thin pool
lvcreate -L 80M -T lvm_pool/thin_pool
# create volume in thin pool
lvcreate -T lvm_pool/thin_pool -V 500M -n first_lv
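As a minimal sketch of what thin provisioning buys us (the mount point and filesystem are arbitrary choices here): first_lv advertises 500M even though the pool only holds 80M, and pool space is consumed only as data is actually written:
mkfs.ext4 /dev/lvm_pool/first_lv
mkdir -p /mnt/first_lv
mount /dev/lvm_pool/first_lv /mnt/first_lv
lvs lvm_pool    # Data% shows how much of the thin pool is really in use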
The default storage pool Docker creates at startup looks like this:
#dmsetup table docker-253:1-138011042-pool
0 209715200 thin-pool 7:2 7:1 128 32768 1 skip_block_zeroing # 209715200*512/1024/1024/1024=100GB
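For reference, the fields of that thin-pool target line break down as follows (per the kernel's thin-provisioning documentation):
# 0 209715200           start sector and length, in 512-byte sectors
# thin-pool 7:2 7:1     target type, metadata device (loop2), data device (loop1)
# 128                   data block size in sectors, i.e. 64K (dm.blocksize)
# 32768                 low water mark, in blocks
# 1 skip_block_zeroing  one feature argument: skip zeroing newly allocated blocks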
After a container is started, a 10G thin device is carved out of that pool:
#dmsetup table docker-253:1-138011042-641cdebd22b55f2656a560cd250e661ab181dcf2f5c5b78dc306df7ce62231f2
0 20971520 thin 253:2 166 # 20971520*512/1024/1024/1024=10GB
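The thin target takes just two parameters: the pool device and the device id inside that pool:
# 0 20971520       10G worth of 512-byte sectors
# thin 253:2 166   pool device 253:2, thin device id 166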
The allocation of that 10G device happens in two steps:
# register a new thin device (id 166) in the pool's metadata
dmsetup message /dev/mapper/docker-253:1-138011042-pool 0 "create_thin 166"
# activate it as a 10G (20971520-sector) device mapped to id 166
dmsetup create docker-253:1-138011042-641cdebd22b55f2656a560cd250e661ab181dcf2f5c5b78dc306df7ce62231f2 --table "0 20971520 thin /dev/mapper/docker-253:1-138011042-pool 166"
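When the container is removed, the reverse happens; a sketch using the standard thin-pool operations (device name as in the example above):
dmsetup remove docker-253:1-138011042-641cdebd22b55f2656a560cd250e661ab181dcf2f5c5b78dc306df7ce62231f2
dmsetup message /dev/mapper/docker-253:1-138011042-pool 0 "delete 166"    # free id 166 in the pool metadata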
Creating a snapshot follows the same message-based pattern (this example uses a separate pool, yy_thin_pool, whose existing thin device /dev/mapper/thin has device id 0):
# quiesce the origin device so the snapshot is consistent
dmsetup suspend /dev/mapper/thin
# create device id 1 as a snapshot of device id 0
dmsetup message /dev/mapper/yy_thin_pool 0 "create_snap 1 0"
dmsetup resume /dev/mapper/thin
# activate the snapshot (40960 sectors = 20M, same size as the origin)
dmsetup create snap --table "0 40960 thin /dev/mapper/yy_thin_pool 1"
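Assuming the origin held a filesystem, the snapshot mounts like any other block device (the mount point is an arbitrary choice):
mkdir -p /mnt/snap
mount /dev/mapper/snap /mnt/snap
dmsetup status yy_thin_pool    # pool usage barely grows until snapshot and origin diverge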
The docker daemon accepts devicemapper options at startup via docker -d --storage-opt dm.foo=bar. The available options are listed below, followed by an example invocation:
- dm.basesize: size of the base device, 10G by default; this caps the size of images and containers
- dm.loopdatasize: size of the storage pool, 100G by default
- dm.datadev: device to use for the storage pool; by default a loopback device over a generated /var/lib/docker/devicemapper/devicemapper/data file
- dm.loopmetadatasize: size of the metadata, 2G by default
- dm.metadatadev: device to use for the metadata; by default a loopback device over a generated /var/lib/docker/devicemapper/devicemapper/metadata file
- dm.fs: filesystem for the base device, ext4 by default
- dm.blocksize: block size of the thin pool, 64K by default
- dm.blkdiscard: whether to issue a blkdiscard when removing devices, true by default
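For instance, a daemon started with a bigger base device, a bigger pool, and xfs instead of ext4 would look like this (the values are illustrative):
docker -d --storage-opt dm.basesize=20G --storage-opt dm.loopdatasize=200G --storage-opt dm.fs=xfs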
Finally, let's look at how a container's configuration is organized once it is running.
When a container is created, its basic configuration is written under /var/lib/docker/containers/:
#ls /var/lib/docker/containers/49f19ee979f6bf125c62779dcabf3bdce310b13d22e5c826752db202e509154e -l
total 20
-rw------- 1 root root 0 Nov 18 16:31 49f19ee979f6bf125c62779dcabf3bdce310b13d22e5c826752db202e509154e-json.log
-rw-r--r-- 1 root root 1741 Nov 18 16:31 config.json
-rw-r--r-- 1 root root 368 Nov 18 16:31 hostconfig.json
-rw-r--r-- 1 root root 13 Nov 18 16:31 hostname
-rw-r--r-- 1 root root 175 Nov 18 16:31 hosts
-rw-r--r-- 1 root root 325 Nov 18 16:31 resolv.conf
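These files back what docker inspect reports; for example, for the container above:
docker inspect 49f19ee979f6 | head    # rendered from config.json and hostconfig.json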
Once the 10G is allocated, the container's storage metadata is written to the following two files (one for the init layer, one for the container's own layer):
# cd /var/lib/docker
#cat ./devicemapper/metadata/49f19ee979f6bf125c62779dcabf3bdce310b13d22e5c826752db202e509154e-init
{"device_id":174,"size":10737418240,"transaction_id":731,"initialized":false}
#cat ./devicemapper/metadata/49f19ee979f6bf125c62779dcabf3bdce310b13d22e5c826752db202e509154e
{"device_id":175,"size":10737418240,"transaction_id":732,"initialized":false}
The container's rootfs is then mounted at /var/lib/docker/devicemapper/mnt/<container_id>:
#mount | grep 49f1
/dev/mapper/docker-253:1-138011042-49f19ee979f6bf125c62779dcabf3bdce310b13d22e5c826752db202e509154e on /var/lib/docker/devicemapper/mnt/49f19ee979f6bf125c62779dcabf3bdce310b13d22e5c826752db202e509154e type ext4 (rw,relatime,discard,stripe=16,data=ordered)
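Inside that mount point the actual root filesystem sits in a rootfs subdirectory (layout as of this Docker version):
ls /var/lib/docker/devicemapper/mnt/49f19ee979f6bf125c62779dcabf3bdce310b13d22e5c826752db202e509154e
# expect an id file plus rootfs/, which holds the container's /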
References