Nomad is a tool for managing a cluster of machines and running applications on it.
Prepare three VMs as described in the earlier article 《Consul 搭建集羣》 (Building a Consul Cluster).
Node | IP
---|---
n1 | 172.20.20.10
n2 | 172.20.20.11
n3 | 172.20.20.12
Log in to VM n1 and switch to the root user.
```
» vagrant ssh n1
[vagrant@n1 ~]$ su
Password:
[root@n1 vagrant]#
```
Install a few required tools.
```
[root@n1 vagrant]# yum install -y epel-release
[root@n1 vagrant]# yum install -y jq
[root@n1 vagrant]# yum install -y unzip
```
Download version 0.8.1 to the /tmp directory.
The latest 0.8.3 release has a bug that causes services to be registered with Consul repeatedly, so 0.8.1 is used here.
```
[root@n1 vagrant]# cd /tmp/
[root@n1 vagrant]# curl -s https://releases.hashicorp.com/nomad/0.8.1/nomad_0.8.1_linux_amd64.zip -o nomad.zip
```
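Optionally, the download can be verified against the SHA256 sums that HashiCorp publishes alongside each release; a quick sanity check (file name follows the usual release layout):

```
[root@n1 vagrant]# curl -s https://releases.hashicorp.com/nomad/0.8.1/nomad_0.8.1_SHA256SUMS -o nomad_SHA256SUMS
[root@n1 vagrant]# grep linux_amd64 nomad_SHA256SUMS
[root@n1 vagrant]# sha256sum nomad.zip    # should match the line above
```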
Unzip it, make the nomad binary executable, and move it to /usr/bin/.
```
[root@n1 vagrant]# unzip nomad.zip
[root@n1 vagrant]# chmod +x nomad
[root@n1 vagrant]# mv nomad /usr/bin/nomad
```
Check that Nomad was installed successfully.
```
[root@n1 vagrant]# nomad
Usage: nomad [-version] [-help] [-autocomplete-(un)install] <command> [args]

Common commands:
    run         Run a new job or update an existing job
    stop        Stop a running job
    status      Display the status output for a resource
    alloc       Interact with allocations
    job         Interact with jobs
    node        Interact with nodes
    agent       Runs a Nomad agent

Other commands:
    acl           Interact with ACL policies and tokens
    agent-info    Display status information about the local agent
    deployment    Interact with deployments
    eval          Interact with evaluations
    namespace     Interact with namespaces
    operator      Provides cluster-level tools for Nomad operators
    quota         Interact with quotas
    sentinel      Interact with Sentinel policies
    server        Interact with servers
    ui            Open the Nomad Web UI
    version       Prints the Nomad version
```
Output like the above means the installation succeeded.
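To double-check that the pinned 0.8.1 build (and not a stray newer one) is the binary on the PATH:

```
[root@n1 vagrant]# nomad version    # should report v0.8.1
```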
Refer to the "batch installation" section of the earlier article 《Consul 搭建集羣》 (Building a Consul Cluster).
The following script installs Nomad on all the VMs in one go and also installs Docker on each of them.
```
$script = <<SCRIPT
echo "Installing dependencies ..."
yum install -y epel-release
yum install -y net-tools
yum install -y wget
yum install -y jq
yum install -y unzip
yum install -y bind-utils

echo "Determining Consul version to install ..."
CHECKPOINT_URL="https://checkpoint-api.hashicorp.com/v1/check"
if [ -z "$CONSUL_DEMO_VERSION" ]; then
    CONSUL_DEMO_VERSION=$(curl -s "${CHECKPOINT_URL}"/consul | jq .current_version | tr -d '"')
fi

echo "Fetching Consul version ${CONSUL_DEMO_VERSION} ..."
cd /tmp/
curl -s https://releases.hashicorp.com/consul/${CONSUL_DEMO_VERSION}/consul_${CONSUL_DEMO_VERSION}_linux_amd64.zip -o consul.zip

echo "Installing Consul version ${CONSUL_DEMO_VERSION} ..."
unzip consul.zip
sudo chmod +x consul
sudo mv consul /usr/bin/consul
sudo mkdir /etc/consul.d
sudo chmod a+w /etc/consul.d

echo "Determining Nomad 0.8.1 to install ..."
#CHECKPOINT_URL="https://checkpoint-api.hashicorp.com/v1/check"
#if [ -z "$NOMAD_DEMO_VERSION" ]; then
#  NOMAD_DEMO_VERSION=$(curl -s "${CHECKPOINT_URL}"/nomad | jq .current_version | tr -d '"')
#fi

echo "Fetching Nomad 0.8.1 ..."
cd /tmp/
curl -s https://releases.hashicorp.com/nomad/0.8.1/nomad_0.8.1_linux_amd64.zip -o nomad.zip

echo "Installing Nomad version 0.8.1 ..."
unzip nomad.zip
sudo chmod +x nomad
sudo mv nomad /usr/bin/nomad

echo "Installing nginx ..."
#yum install -y nginx

echo "Installing docker ..."
yum install -y docker
SCRIPT
```
First start Consul and form a cluster (see 《Consul 搭建集羣》 for details). With the default configuration, Nomad detects the local Consul agent after startup and automatically registers its services with it.
On n1:

```
[root@n1 vagrant]# consul agent -server -bootstrap-expect 3 -data-dir /etc/consul.d -node=node1 -bind=172.20.20.10 -ui -client 0.0.0.0
```
On n2:

```
[root@n2 vagrant]# consul agent -server -bootstrap-expect 3 -data-dir /etc/consul.d -node=node2 -bind=172.20.20.11 -ui -client 0.0.0.0 -join 172.20.20.10
```
On n3:

```
[root@n3 vagrant]# consul agent -server -bootstrap-expect 3 -data-dir /etc/consul.d -node=node3 -bind=172.20.20.12 -ui -client 0.0.0.0 -join 172.20.20.10
```
```
[root@n1 vagrant]# consul members
Node   Address            Status  Type    Build  Protocol  DC   Segment
node1  172.20.20.10:8301  alive   server  1.1.0  2         dc1  <all>
node2  172.20.20.11:8301  alive   server  1.1.0  2         dc1  <all>
node3  172.20.20.12:8301  alive   server  1.1.0  2         dc1  <all>
```
Define the server configuration file server.hcl:
log_level = "DEBUG" bind_addr = "0.0.0.0" data_dir = "/home/vagrant/data_server" name = "server1" advertise { http = "172.20.20.10:4646" rpc = "172.20.20.10:4647" serf = "172.20.20.10:4648" } server { enabled = true # Self-elect, should be 3 or 5 for production bootstrap_expect = 3 }
Run it from the command line:
```
[root@n1 vagrant]# nomad agent -config=server.hcl
```
Then go to n2 and n3 and run the same command, adjusting name and the advertise addresses for each node (see the sketch below):
```
nomad agent -config=server.hcl
```
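For reference, a minimal sketch of what this could look like on n2; the file mirrors the one on n1, only with this node's own name and advertise addresses (n3 is analogous with server3 / 172.20.20.12):

```
# Sketch only: write n2's server.hcl and start the agent.
cat > server.hcl <<'EOF'
log_level = "DEBUG"
bind_addr = "0.0.0.0"
data_dir  = "/home/vagrant/data_server"
name      = "server2"

advertise {
  http = "172.20.20.11:4646"
  rpc  = "172.20.20.11:4647"
  serf = "172.20.20.11:4648"
}

server {
  enabled          = true
  bootstrap_expect = 3
}
EOF
nomad agent -config=server.hcl
```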
Open http://172.20.20.10:8500/ui/#/dc1/services in a browser.
In Consul you can see that all the Nomad servers have started.
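The same registration can be checked through Consul's HTTP API; by default Nomad registers its servers under the service name "nomad" (and, later on, its clients under "nomad-client"):

```
# List all services known to Consul, then the nodes behind the "nomad" service.
curl -s http://172.20.20.10:8500/v1/catalog/services | jq .
curl -s http://172.20.20.10:8500/v1/catalog/service/nomad | jq '.[].Node'
```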
Then open Nomad's built-in UI at http://172.20.20.10:4646/ui/servers.
You can see that all the servers are running.
Docker must be started before the clients, because the clients need Docker to run jobs.
```
[root@n1 vagrant]# systemctl start docker
```
Docker also needs to be started on n2 and n3.
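Optionally, enable Docker at boot and make sure the daemon answers before starting any client:

```
[root@n1 vagrant]# systemctl enable docker
[root@n1 vagrant]# docker info
```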
Define the client configuration file client.hcl:
log_level = "DEBUG" data_dir = "/home/vagrant/data_clinet" name = "client1" advertise { http = "172.20.20.10:4646" rpc = "172.20.20.10:4647" serf = "172.20.20.10:4648" } client { enabled = true servers = ["172.20.20.10:4647"] } ports { http = 5656 }
Run the following command on n1:
```
[root@n1 vagrant]# nomad agent -config=client.hcl
```
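Note the `ports { http = 5656 }` stanza above: presumably it is there to avoid clashing with the server agent that already listens on the default 4646 on the same host. Once the client is up, its HTTP API can be poked directly on that port, for example:

```
# Ask the client agent about itself on its alternate HTTP port.
curl -s http://172.20.20.10:5656/v1/agent/self | jq .
```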
Open http://172.20.20.10:8500/ui/#/dc1/services/nomad-client in a browser.
You can see that nomad-client has started successfully. Run the client on n2 and n3 in the same way.
The final result looks like this.
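The same picture is available from the CLI; once all three clients are up they should show as ready:

```
# List the registered client nodes (expect n1, n2 and n3 in "ready" state).
nomad node status
```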
Go to n2, create a job directory, and run nomad init.
```
[root@n2 vagrant]# mkdir job
[root@n2 vagrant]# cd job/
[root@n2 job]# nomad init
Example job file written to example.nomad
```
The command above created an example job file, example.nomad.
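The generated example.nomad describes a "cache" task group that runs a Redis container through the Docker driver; the parts we will touch later can be located quickly, e.g.:

```
# Show the group count, driver and image lines of the generated job file.
grep -n -E 'count|driver|image' example.nomad
```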
Enter the following on the command line:
```
[root@n2 job]# nomad run example.nomad
==> Monitoring evaluation "97f8a1fe"
    Evaluation triggered by job "example"
    Evaluation within deployment: "3c89e74a"
    Allocation "47bf1f20" created: node "9df69026", group "cache"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "97f8a1fe" finished with status "complete"
```
You can see that the job was executed by the client on node 9df69026.
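The allocation ID from the output can be inspected further with `nomad alloc status` (using the ID shown above):

```
# Drill into the single allocation that was placed.
nomad alloc status 47bf1f20
```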
```
[root@n1 vagrant]# nomad server members
Name            Address       Port  Status  Leader  Protocol  Build  Datacenter  Region
server1.global  172.20.20.10  4648  alive   false   2         0.8.1  dc1         global
server2.global  172.20.20.11  4648  alive   false   2         0.8.1  dc1         global
server3.global  172.20.20.12  4648  alive   true    2         0.8.1  dc1         global
```
```
[root@n1 vagrant]# nomad status example
ID            = example
Name          = example
Submit Date   = 2018-06-13T08:42:57Z
Type          = service
Priority      = 50
Datacenters   = dc1
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
cache       0       0         1        0       0         0

Latest Deployment
ID          = 3c89e74a
Status      = successful
Description = Deployment completed successfully

Deployed
Task Group  Desired  Placed  Healthy  Unhealthy
cache       1        1       1        0

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created    Modified
47bf1f20  9df69026  cache       0        run      running  8m44s ago  8m26s ago
```
Edit example.nomad, find `count = 1`, and change it to `count = 3`.
Review the job's change plan from the command line:
```
[root@n2 job]# nomad plan example.nomad
+/- Job: "example"
+/- Task Group: "cache" (2 create, 1 in-place update)
  +/- Count: "1" => "3" (forces create)
      Task: "redis"

Scheduler dry-run:
- All tasks successfully allocated.

Job Modify Index: 70
To submit the job with version verification run:

nomad job run -check-index 70 example.nomad

When running the job with the check-index flag, the job will only be run if the
server side version matches the job modify index returned. If the index has
changed, another user has modified the job and the plan's results are
potentially invalid.
```
Apply the job change:
```
[root@n2 job]# nomad job run -check-index 70 example.nomad
==> Monitoring evaluation "3a0ff5e0"
    Evaluation triggered by job "example"
    Evaluation within deployment: "2b5b803f"
    Allocation "34086acb" created: node "6166e031", group "cache"
    Allocation "4d01cd92" created: node "f97b5095", group "cache"
    Allocation "47bf1f20" modified: node "9df69026", group "cache"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "3a0ff5e0" finished with status "complete"
```
You can see that two more client nodes are now running the job.
In the browser you can see a total of three instances.
You can also see the job's version history.
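The same version history can be read from the CLI:

```
# Show the recorded versions of the example job (the original and the count=3 update).
nomad job history example
```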
```
[root@n2 job]# nomad status example
ID            = example
Name          = example
Submit Date   = 2018-06-13T08:56:03Z
Type          = service
Priority      = 50
Datacenters   = dc1
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
cache       0       0         3        0       0         0

Latest Deployment
ID          = 2b5b803f
Status      = successful
Description = Deployment completed successfully

Deployed
Task Group  Desired  Placed  Healthy  Unhealthy
cache       3        3       3        0

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created     Modified
34086acb  6166e031  cache       1        run      running  3m38s ago   3m25s ago
4d01cd92  f97b5095  cache       1        run      running  3m38s ago   3m26s ago
47bf1f20  9df69026  cache       1        run      running  16m43s ago  3m27s ago
```
First stop the Nomad server on n1 with Ctrl-C.
Query the members from n2:
```
[root@n2 job]# nomad server members
Name            Address       Port  Status  Leader  Protocol  Build  Datacenter  Region
server1.global  172.20.20.10  4648  failed  false   2         0.8.1  dc1         global
server2.global  172.20.20.11  4648  alive   true    2         0.8.1  dc1         global
server3.global  172.20.20.12  4648  alive   false   2         0.8.1  dc1         global
```
server1's status is now failed; remove server1 from the cluster:
```
[root@n2 job]# nomad server force-leave server1.global
[root@n2 job]# nomad server members
Name            Address       Port  Status  Leader  Protocol  Build  Datacenter  Region
server1.global  172.20.20.10  4648  left    false   2         0.8.1  dc1         global
server2.global  172.20.20.11  4648  alive   true    2         0.8.1  dc1         global
server3.global  172.20.20.12  4648  alive   false   2         0.8.1  dc1         global
```
server1's status is now left, so it has been removed from the cluster successfully.