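A First Look at Docker Swarm Commands
Test environment:
System layout:
Upgrading Docker with Docker for Mac is straightforward, so I won't go into it here; after upgrading, just confirm that you are running the latest version of Docker.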
First, use Docker Machine to create three hosts: one named manager1 and two others named worker1 and worker2.
Create the manager node:
$ docker-machine create --driver virtualbox manager1
Create the two worker nodes:
$ docker-machine create --driver virtualbox worker1
$ docker-machine create --driver virtualbox worker2
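The three create commands differ only in the machine name, so they can be wrapped in a small loop. This is a minimal sketch, not something Docker Machine provides: create_hosts and the DOCKER_MACHINE variable are hypothetical names, and DOCKER_MACHINE exists only so the command can be swapped for echo in a dry run.

```shell
# Command used to create machines; set DOCKER_MACHINE=echo for a dry run.
# (DOCKER_MACHINE and create_hosts are illustrative names, not Docker Machine features.)
DOCKER_MACHINE="${DOCKER_MACHINE:-docker-machine}"

# Create one VirtualBox machine per name passed as an argument.
# Stops at the first failure and returns a non-zero status.
create_hosts() {
  for name in "$@"; do
    "$DOCKER_MACHINE" create --driver virtualbox "$name" || return 1
  done
}
```

Calling create_hosts manager1 worker1 worker2 would run the same three commands shown above.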
Afterwards, confirm that the three machines were created successfully and check their IP addresses:
$ docker-machine ls
NAME       ACTIVE   DRIVER       STATE     URL                         SWARM   DOCKER        ERRORS
manager1   -        virtualbox   Running   tcp://192.168.99.100:2376           v1.12.0-rc2
worker1    -        virtualbox   Running   tcp://192.168.99.101:2376           v1.12.0-rc2
worker2    -        virtualbox   Running   tcp://192.168.99.102:2376           v1.12.0-rc2
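The later steps need the manager's IP address rather than the full URL. docker-machine ip manager1 prints it directly; if you only have the URL column from the listing, a small helper can strip the scheme and port. The function name url_to_ip is hypothetical and used only for illustration.

```shell
# Turn a Docker Machine URL such as tcp://192.168.99.100:2376 into a bare IP.
# (url_to_ip is an illustrative helper name, not a Docker Machine command.)
url_to_ip() {
  hostport="${1#*://}"              # drop the "tcp://" scheme
  printf '%s\n' "${hostport%%:*}"   # drop the ":2376" port
}
```

For example, url_to_ip tcp://192.168.99.100:2376 yields the address that is later passed to docker swarm init.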
Now switch to the manager1 host and create a cluster with the docker swarm init command:
$ eval $(docker-machine env manager1)
$ docker swarm init --listen-addr 192.168.99.100:2377
Swarm initialized: current node (alq8w7fi34f41j3z4ise1vkd7) is now a manager.
Use docker info to check the current node's information; the output includes a Swarm section:
$ docker info
... ...
Swarm: active
 NodeID: alq8w7fi34f41j3z4ise1vkd7
 IsManager: Yes
 Managers: 1
 Nodes: 1
 CACertHash: sha256:7a9d0eb1621afe2be07c5fd405b8f038c76be3a8dc7b2c73944a2d1ab9dffd76
... ...
On the two worker nodes, use the docker swarm join command to join the cluster just created (worker1 is shown here; worker2 is joined the same way):
$ eval $(docker-machine env worker1)
$ docker swarm join 192.168.99.100:2377
This node joined a Swarm as a worker.
$ docker info
... ...
Swarm: active
 NodeID: 4l2a9ebgmcpwqlo0roye0n6m5
 IsManager: No
... ...
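Since both workers go through the same two steps, it can help to print the full command sequence before running it. The sketch below only prints commands and does not execute them; join_commands is a hypothetical helper name. It uses the join syntax exactly as shown above for v1.12.0-rc2; later Docker releases also require a join token, so treat this as specific to the version used here.

```shell
# Print the commands needed to join each worker to the swarm.
# $1 is the manager's listen address; the remaining arguments are machine names.
# (join_commands is an illustrative name; nothing is executed here.)
join_commands() {
  manager_addr="$1"
  shift
  for worker in "$@"; do
    printf 'eval $(docker-machine env %s)\n' "$worker"
    printf 'docker swarm join %s\n' "$manager_addr"
  done
}
```

Running join_commands 192.168.99.100:2377 worker1 worker2 lists the four commands for the two workers in order.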
Back on the manager1 node, check the number and status of the nodes in the cluster. The MANAGER STATUS column shows which node is the Leader:
$ eval $(docker-machine env manager1)
$ docker node ls
ID                           NAME      MEMBERSHIP  STATUS  AVAILABILITY  MANAGER STATUS
4l2a9ebgmcpwqlo0roye0n6m5    worker1   Accepted    Ready   Active
alq8w7fi34f41j3z4ise1vkd7 *  manager1  Accepted    Ready   Active        Leader
e0khh79c6owm0e14mli602q0a    worker2   Accepted    Ready   Active
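To pick the Leader out of that listing in a script, one option is to filter the text with awk. This is a rough sketch that assumes the column layout shown above (the current node is marked with a separate * field, and Leader is the last field of its row); find_leader is a hypothetical name.

```shell
# Read "docker node ls" output on stdin and print the name of the Leader node.
# Assumes the layout shown above: an optional "*" marker follows the ID.
# (find_leader is an illustrative helper name.)
find_leader() {
  awk 'NR > 1 && $NF == "Leader" {
         name = ($2 == "*") ? $3 : $2   # skip the "*" marker if present
         print name
       }'
}
```

It would be used as docker node ls | find_leader on the manager.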
To deploy a service, first create an overlay network:
$ docker network create -d overlay ngx_net
8kiv8muduf60f66rs99ufo25f
Then create the nginx service on this overlay network:
$ docker service create --name nginx --replicas 1 --network ngx_net -p 80:80/tcp nginx
78cmmh8ef4qcwmjjzgn3k45ch
This creates an nginx service with one replica (--replicas 1), using the nginx image. List the service's tasks with docker service tasks:
$ docker service tasks nginx
ID                         NAME     SERVICE  IMAGE  LAST STATE           DESIRED STATE  NODE
394by1b72i44a44jms2xwk6ud  nginx.1  nginx    nginx  Preparing 9 seconds  Running        manager1
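Note that in the LAST STATE column above the task starts out as Preparing; it takes a while to reach Running, and most of that time is probably spent downloading the image.
Check the status again a little later and you will see that it has changed to Running. At that point the Nginx service is reachable at http://192.168.99.100/.
$ docker service tasks nginx
ID                         NAME     SERVICE  IMAGE  LAST STATE              DESIRED STATE  NODE
394by1b72i44a44jms2xwk6ud  nginx.1  nginx    nginx  Running About a minute  Running        manager1
# Use curl to check that the service is running properly
$ curl http://192.168.99.100/
...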
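Rather than re-running the check by hand until the task reaches Running, the wait can be scripted with a generic retry loop. The helper below is a sketch under stated assumptions: retry is a hypothetical function name, and the attempt count and delay are arbitrary values you would tune.

```shell
# Run a command until it succeeds, up to a maximum number of attempts.
# $1 = max attempts, $2 = seconds to sleep between attempts, rest = command.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
# (retry is an illustrative helper, not part of Docker or curl.)
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For instance, retry 30 2 curl -fsS http://192.168.99.100/ would poll the published port every two seconds for up to a minute; the -f flag makes curl return a failure status on HTTP errors.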
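Of course, if services could only start containers, Swarm would hardly be anything new. Services also provide replication (similar to replicas in k8s). Use the docker service scale command to set the number of container replicas in a service:
$ docker service scale nginx=5
nginx scaled to 5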
As with creating the service, raising the scale creates new containers, and these go through the same progression from preparing to running. After about a minute the service should be fully started, and we can look again at the containers (tasks) in the nginx service:
$ docker service tasks nginx
ID                         NAME     SERVICE  IMAGE  LAST STATE         DESIRED STATE  NODE
394by1b72i44a44jms2xwk6ud  nginx.1  nginx    nginx  Running 5 minutes  Running        manager1
co0re9u7infoo9qiegm6yiqcn  nginx.2  nginx    nginx  Running 2 minutes  Running        worker1
19dvayah8fjz3vykrl2oi12uu  nginx.3  nginx    nginx  Running 2 minutes  Running        worker1
d8okdip767972p083tix4dk7d  nginx.4  nginx    nginx  Running 2 minutes  Running        manager1
9rq59mf6bq5m411y6gdzb5pq6  nginx.5  nginx    nginx  Running 2 minutes  Running        worker2
As you can see, the nginx service previously had a single instance on manager1, and four more instances have now been added.
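To see how the replicas are spread across nodes, the task listing can be summarized per node. This sketch assumes the column layout shown in this article, where the node name is the last field of each row; tasks_per_node is a hypothetical name.

```shell
# Read "docker service tasks" output on stdin and print "<node> <task count>",
# one line per node, sorted by node name.
# Assumes the node name is the last whitespace-separated field of each row.
# (tasks_per_node is an illustrative helper name.)
tasks_per_node() {
  awk 'NR > 1 { count[$NF]++ }
       END    { for (node in count) print node, count[node] }' | sort
}
```

Piping the five-task listing above through it would report two tasks each on manager1 and worker1, and one on worker2.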
We can list the containers running on the manager1 host:
$ docker ps
CONTAINER ID  IMAGE         COMMAND                 CREATED        STATUS        PORTS            NAMES
4bbc02bc426e  nginx:latest  "nginx -g 'daemon off"  4 minutes ago  Up 4 minutes  80/tcp, 443/tcp  nginx.4.d8okdip767972p083tix4dk7d
1eda6f7d3029  nginx:latest  "nginx -g 'daemon off"  5 minutes ago  Up 5 minutes  80/tcp, 443/tcp  nginx.1.394by1b72i44a44jms2xwk6ud
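What does Docker do if one of a service's tasks suddenly terminates? Here we simulate a container terminating abnormally.
Before doing so, open another window and have tail -f /var/log/docker.log ready to follow the Docker daemon's log.
We remove a container with docker rm -f (4b is a prefix of 4bbc02bc426e, the first container in the docker ps listing above):
$ docker rm -f 4b
4b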
Back in the window following the Docker daemon log, we see entries like the ones below (abridged). They show the old container (id 4bbc02bc426e, task id d8okdip767972p083tix4dk7d) being removed and the new container (task id cumjdktbadaxca66rt4hi63na) being scheduled. Timestamps, log levels, and other unimportant fields have been removed, but the chronological order is preserved:
# Call the API that deletes the container
msg="Calling DELETE /v1.24/containers/4b?force=1"
# Task d8 goes from RUNNING to FAILED
msg="state changed" module=taskmanager state.desired=RUNNING state.transition="RUNNING->FAILED" task.id=d8okdip767972p083tix4dk7d
# Shut down the old task
msg=assigned module=agent task.desiredstate=SHUTDOWN task.id=d8okdip767972p083tix4dk7d
# Assign to a new node and create a new task id: cumj
msg="Assigning to node e0khh79c6owm0e14mli602q0a" task.id=cumjdktbadaxca66rt4hi63na
# The old task's status is updated to FAILED
msg="(*Agent).UpdateTaskStatus" module=agent task.id=d8okdip767972p083tix4dk7d
msg="task status updated" method="(*Dispatcher).processTaskUpdates" module=dispatcher state.transition="FAILED->FAILED" task.id=d8okdip767972p083tix4dk7d
# The new task goes from ASSIGNED to ACCEPTED
msg="task status updated" method="(*Dispatcher).processTaskUpdates" module=dispatcher state.transition="ASSIGNED->ACCEPTED" task.id=cumjdktbadaxca66rt4hi63na
# Prepare to run the new task
msg="task status updated" method="(*Dispatcher).processTaskUpdates" module=dispatcher state.transition="ACCEPTED->PREPARING" task.id=cumjdktbadaxca66rt4hi63na
# Start the new task
msg="task status updated" method="(*Dispatcher).processTaskUpdates" module=dispatcher state.transition="PREPARING->STARTING" task.id=cumjdktbadaxca66rt4hi63na
# The new task has finished starting
msg="task status updated" method="(*Dispatcher).processTaskUpdates" module=dispatcher state.transition="STARTING->RUNNING" task.id=cumjdktbadaxca66rt4hi63na
From the log above it is easy to see that the first half of a task's life cycle is roughly ASSIGNED -> ACCEPTED -> PREPARING -> STARTING -> RUNNING.
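The same sequence can be pulled out of the daemon log for a single task. The sketch below assumes log lines in the key=value form quoted in this article, with state.transition="..." and task.id=... on the same line; task_transitions is a hypothetical helper name.

```shell
# Read daemon log lines on stdin and print, in order, every state transition
# recorded for the task id given as $1 (for example "ASSIGNED->ACCEPTED").
# Lines for that task without a state.transition field are skipped.
# (task_transitions is an illustrative helper name.)
task_transitions() {
  grep -F "task.id=$1" |
    sed -n 's/.*state\.transition="\([^"]*\)".*/\1/p'
}
```

It could be used as task_transitions cumjdktbadaxca66rt4hi63na < /var/log/docker.log, with the log path taken from the tail command used earlier.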
On the manager1 host the container has indeed been removed, and docker ps now shows only one container:
$ docker ps
CONTAINER ID  IMAGE         COMMAND                 CREATED        STATUS        PORTS            NAMES
1eda6f7d3029  nginx:latest  "nginx -g 'daemon off"  6 minutes ago  Up 6 minutes  80/tcp, 443/tcp  nginx.1.394by1b72i44a44jms2xwk6ud
Looking at the nginx service's task list again, we can see that the newly created task cumjdktbadaxca66rt4hi63na was scheduled onto worker2:
$ docker service tasks nginx
ID                         NAME     SERVICE  IMAGE  LAST STATE          DESIRED STATE  NODE
394by1b72i44a44jms2xwk6ud  nginx.1  nginx    nginx  Running 8 minutes   Running        manager1
co0re9u7infoo9qiegm6yiqcn  nginx.2  nginx    nginx  Running 5 minutes   Running        worker1
19dvayah8fjz3vykrl2oi12uu  nginx.3  nginx    nginx  Running 5 minutes   Running        worker1
cumjdktbadaxca66rt4hi63na  nginx.4  nginx    nginx  Running 32 seconds  Running        worker2
9rq59mf6bq5m411y6gdzb5pq6  nginx.5  nginx    nginx  Running 5 minutes   Running        worker2
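It is easy to imagine that if an entire node goes down, Docker should reschedule the containers that were running on it onto other nodes, so that the specified number of replicas stays running.
Let's simulate that scenario.
First, we remove the node worker2:
$ docker-machine rm worker2
About to remove worker2
Are you sure? (y/n): y
Successfully removed worker2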
After the removal, Docker starts rescheduling. Once it finishes (in under a minute), the service's task status should look like this, with five nginx containers running on the two remaining machines:
$ docker service tasks nginx
ID                         NAME     SERVICE  IMAGE  LAST STATE          DESIRED STATE  NODE
394by1b72i44a44jms2xwk6ud  nginx.1  nginx    nginx  Running 22 minutes  Running        manager1
co0re9u7infoo9qiegm6yiqcn  nginx.2  nginx    nginx  Running 19 minutes  Running        worker1
19dvayah8fjz3vykrl2oi12uu  nginx.3  nginx    nginx  Running 19 minutes  Running        worker1
991v97eg9q1hnnzxda6c9mmv7  nginx.4  nginx    nginx  Running 20 seconds  Running        manager1
e9yztfmy5luaxnadz80e5j8nl  nginx.5  nginx    nginx  Running 20 seconds  Running        manager1
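A quick way to confirm that rescheduling has restored the desired replica count is to count the tasks whose last state is Running. This sketch assumes the listing format shown above, where the fifth field of each row is the first word of the LAST STATE column; running_tasks is a hypothetical name.

```shell
# Read "docker service tasks" output on stdin and print how many tasks
# currently report Running as their last state.
# Assumes field 5 is the first word of the LAST STATE column.
# (running_tasks is an illustrative helper name.)
running_tasks() {
  awk 'NR > 1 && $5 == "Running" { n++ } END { print n + 0 }'
}
```

Comparing the result of docker service tasks nginx | running_tasks with the replica count set by docker service scale (5 here) shows whether the service has recovered.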
Besides the commands used above, docker service has the following subcommands:
$ docker service --help

Usage:  docker service COMMAND

Manage Docker services

Options:
      --help   Print usage

Commands:
  create      Create a new service
  inspect     Inspect a service
  tasks       List the tasks of a service
  ls          List services
  rm          Remove a service
  scale       Scale one or multiple services
  update      Update a service

Run 'docker service COMMAND --help' for more information on a command.
For example, docker service inspect returns detailed information about the nginx service:
$ docker service inspect nginx
[
    {
        "ID": "78cmmh8ef4qcwmjjzgn3k45ch",
        "Version": {
            "Index": 32
        },
        "CreatedAt": "2016-06-21T08:31:21.427244594Z",
        "UpdatedAt": "2016-06-21T08:33:50.75625288Z",
        "Spec": {
            "Name": "nginx",
            "TaskTemplate": {
                "ContainerSpec": {
                    "Image": "nginx"
                },
                "Resources": {
                    "Limits": {},
                    "Reservations": {}
                },
                "RestartPolicy": {
                    "Condition": "any",
                    "MaxAttempts": 0
                },
                "Placement": {}
            },
            "Mode": {
                "Replicated": {
                    "Replicas": 5
                }
            },
            "UpdateConfig": {},
            "Networks": [
                {
                    "Target": "8kiv8muduf60f66rs99ufo25f"
                }
            ],
            "EndpointSpec": {
                "Mode": "vip",
                "Ports": [
                    {
                        "Protocol": "tcp",
                        "TargetPort": 80,
                        "PublishedPort": 80
                    }
                ]
            }
        },
        "Endpoint": {
            "Spec": {},
            "Ports": [
                {
                    "Protocol": "tcp",
                    "TargetPort": 80,
                    "PublishedPort": 80
                }
            ],
            "VirtualIPs": [
                {
                    "NetworkID": "e9et32s47olva4e0uisamdqao",
                    "Addr": "10.255.0.6/16"
                },
                {
                    "NetworkID": "8kiv8muduf60f66rs99ufo25f",
                    "Addr": "10.0.0.2/24"
                }
            ]
        }
    }
]
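When only one value from this output is needed, such as the replica count, a quick text filter is often enough. The sketch below is not a JSON parser: it assumes the "Replicas" key appears once and is followed by a plain integer, as in the output above, and replicas_from_inspect is a hypothetical name. If your Docker version's inspect command accepts a --format template, that is likely the more robust route.

```shell
# Read "docker service inspect" JSON on stdin and print the replica count.
# Text-based shortcut, not a real JSON parser: assumes "Replicas": <integer>.
# (replicas_from_inspect is an illustrative helper name.)
replicas_from_inspect() {
  sed -n 's/.*"Replicas": *\([0-9][0-9]*\).*/\1/p'
}
```

It would be used as docker service inspect nginx | replicas_from_inspect.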
The nginx service can also be removed with docker service rm nginx (the current version gives no warning prompt before removing a service, so be careful).
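Getting started is easy: Docker swarm makes it very convenient to create services with replicas, much like k8s, keeping a given number of containers running and the service highly available.
However, judging from the official documentation alone, the feature set still seems rather basic, and for production use it falls short in the following areas (a problem that in fact goes back to Swarm v1):
Still, just as Docker made container technology accessible to everyone, Docker Machine and Swarm make it simpler to run Docker and Docker clusters on all kinds of infrastructure, and in that respect they matter a great deal.
But from the standpoint of the open source community and commercial competition, where will Docker Swarm go from here?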