We currently run a Kubernetes cluster built from 5 servers; monitoring and log collection are already in place, and the business workloads have been migrated into the cluster by hand and run smoothly. The next step is to migrate the CI/CD pipeline, originally built on a native Docker environment, into the Kubernetes cluster.
Implementing CI/CD on a Kubernetes cluster has several notable advantages:

- Deployment natively supports rolling deployments, and combined with other Kubernetes features it can also implement blue-green deployments, canary deployments, and more (see the sketch below)
- GitLab and GitLab Runner natively support Kubernetes clusters, with runner autoscaling that reduces resource usage
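To make the first point concrete, rolling behavior is declared on the Deployment itself. A minimal sketch, with a hypothetical name and image (not from this project):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app            # hypothetical name, for illustration only
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1           # allow 1 extra Pod while rolling
      maxUnavailable: 0     # never dip below the desired replica count
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
    spec:
      containers:
        - name: demo-app
          image: demo/app:1.0.0   # hypothetical image
```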
Environment versions:

- Kubernetes: 1.14
- GitLab: 12.2.5
- GitLab-Runner: 12.1.0
- Docker: 17.03.1
In the original environment, the gitlab runner was deployed in two separate manual steps, running the registration command and then the start command from the official site, which takes a fair amount of manual work. In Kubernetes it can instead be installed in one step with Helm. The official documentation is:

GitLab Runner Helm Chart

Frankly, the official documentation is not very clear, and many options are never explained there; I recommend reading the detailed parameter documentation in the chart's source repository instead. It covers several key settings that will be needed later when modifying the project's ci configuration file.
The official documentation says:

Use docker-in-docker workflow with Docker executor

The second approach is to use the special docker-in-docker (dind) Docker image with all tools installed (docker) and run the job script in context of that image in privileged mode.

Note: docker-compose is not part of docker-in-docker (dind). To use docker-compose in your CI builds, follow the docker-compose installation instructions.

Danger: By enabling --docker-privileged, you are effectively disabling all of the security mechanisms of containers and exposing your host to privilege escalation which can lead to container breakout. For more information, check out the official Docker documentation on Runtime privilege and Linux capabilities.

Docker-in-Docker works well, and is the recommended configuration, but it is not without its own challenges:

- When using docker-in-docker, each job is in a clean environment without the past history. Concurrent jobs work fine because every build gets its own instance of Docker engine so they won't conflict with each other. But this also means jobs can be slower because there's no caching of layers.
- By default, Docker 17.09 and higher uses --storage-driver overlay2 which is the recommended storage driver. See Using the overlayfs driver for details.
- Since the docker:19.03.1-dind container and the Runner container don't share their root filesystem, the job's working directory can be used as a mount point for child containers. For example, if you have files you want to share with a child container, you may create a subdirectory under /builds/$CI_PROJECT_PATH and use it as your mount point (for a more thorough explanation, check issue #41227):
In short, building containers with DinD is not infeasible, but it comes with a number of issues; for example, the recommended overlay2 storage driver requires Docker 17.09 or higher.
Using docker:dind

Running the docker:dind, also known as the docker-in-docker image, is also possible but sadly needs the containers to be run in privileged mode. If you're willing to take that risk other problems will arise that might not seem as straight forward at first glance. Because the docker daemon is started as a service, usually in your .gitlab-ci.yaml, it will be run as a separate container in your Pod. Basically containers in Pods only share volumes assigned to them and an IP address by which they can reach each other using localhost. /var/run/docker.sock is not shared by the docker:dind container and the docker binary tries to use it by default.

To overwrite this and make the client use TCP to contact the Docker daemon, in the other container, be sure to include the environment variables of the build container:

- DOCKER_HOST=tcp://localhost:2375 for no TLS connection.
- DOCKER_HOST=tcp://localhost:2376 for TLS connection.

Make sure to configure those properly. As of Docker 19.03, TLS is enabled by default but it requires mapping certificates to your client. You can enable non-TLS connection for DIND or mount certificates as described in Use Docker In Docker Workflow with Docker executor.
Since Docker 19.03, dind enables TLS by default, so the corresponding variables must be declared in the build environment, otherwise the job fails with an error about being unable to connect to docker. Building with DinD also requires the runner to enable privileged mode in order to access host resources, and because privileged mode is in effect, the resource limits imposed on the runner inside the Pod become ineffective.
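For reference, the two connection modes from the documentation above map to job variables roughly as follows; this is a sketch of the two options, not this project's final configuration (which appears later):

```yaml
variables:
  # Option 1: plain TCP, TLS disabled on the dind service (used later in this post)
  DOCKER_HOST: tcp://localhost:2375
  DOCKER_TLS_CERTDIR: ""

  # Option 2 (alternative): keep TLS enabled; the dind service generates certs
  # under DOCKER_TLS_CERTDIR, which must then be shared with the build container
  # DOCKER_HOST: tcp://localhost:2376
  # DOCKER_TLS_CERTDIR: "/certs"
```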
GitLab now officially offers another way to build and push images from inside a docker container, one that is more elegant and allows a seamless migration: kaniko.

Building a Docker image with kaniko

Its advantages, as the official site describes them:
Another way to build Docker images in a Kubernetes cluster is to use kaniko. kaniko:

- allows you to build images without privileged access;
- works without the Docker daemon.
The walkthrough below uses both approaches to build Docker images; pick whichever suits your situation.
Pull the GitLab-Runner Helm chart repository to a local directory so its configuration can be modified; a sketch of fetching it is shown below.
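A minimal sketch of pulling the chart (assuming Helm 2, to match the `helm install --name` syntax used below; the repo URL is GitLab's official chart repository):

```bash
# Add GitLab's chart repository and unpack the chart for local editing
$ helm repo add gitlab https://charts.gitlab.io
$ helm fetch gitlab/gitlab-runner --untar --untardir /root/
```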
Migrate the original gitlab-runner configuration into the Helm values; after migration it looks like this:
```yaml
image: gitlab/gitlab-runner:alpine-v12.1.0
imagePullPolicy: IfNotPresent
gitlabUrl: https://gitlab.fjy8018.top/
runnerRegistrationToken: "ZXhpuj4Dxmx2tpxW9Kdr"
unregisterRunners: true
terminationGracePeriodSeconds: 3600
concurrent: 10
checkInterval: 30
rbac:
  create: true
  clusterWideAccess: false
metrics:
  enabled: true
  listenPort: 9090
runners:
  image: ubuntu:16.04
  imagePullSecrets:
    - name: registry-secret
  locked: false
  tags: "k8s"
  runUntagged: true
  privileged: true
  pollTimeout: 180
  outputLimit: 4096
  cache: {}
  builds: {}
  services: {}
  helpers: {}
resources:
  limits:
    memory: 2048Mi
    cpu: 1500m
  requests:
    memory: 128Mi
    cpu: 200m
affinity: {}
nodeSelector: {}
tolerations: []
hostAliases:
  - ip: "192.168.1.13"
    hostnames:
      - "gitlab.fjy8018.top"
  - ip: "192.168.1.30"
    hostnames:
      - "harbor.fjy8018.top"
podAnnotations: {}
```
This configures the registration token, the internal gitlab and harbor addresses (via hostAliases), the harbor image pull secret (registry-secret), and the resource limit policy.
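Note that the registry-secret referenced by imagePullSecrets must already exist in the runner's namespace; a sketch of creating it (the harbor credentials are placeholders):

```bash
# Create the docker-registry pull secret in the namespace the runner Pods use
$ kubectl create secret docker-registry registry-secret \
    --docker-server=harbor.fjy8018.top \
    --docker-username=<harbor-user> \
    --docker-password=<harbor-password> \
    -n gitlab-runner
```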
The runner image is pinned to alpine-v12.1.0, which deserves a note of its own. The latest runner version at the time of writing is 12.5.0, but it has plenty of problems: the newer alpine-based images intermittently fail to resolve DNS inside Kubernetes, which surfaces in GitLab-Runner as Could not resolve host and server misbehaving errors.
Searching for a fix shows that the official repositories still have several related issues open:

- Official gitlab: Kubernetes runner: Could not resolve host
- stackoverflow: Gitlab Runner is not able to resolve DNS of Gitlab Server

The solutions offered, without exception, come down to downgrading to alpine-v12.1.0:
We had same issue for couple of days. We tried change CoreDNS config, move runners to different k8s cluster and so on. Finally today i checked my personal runner and found that i'm using different version. Runners in cluster had gitlab/gitlab-runner:alpine-v12.3.0, when mine had gitlab/gitlab-runner:alpine-v12.0.1. We added line image: gitlab/gitlab-runner:alpine-v12.1.0 in values.yaml and this solved problem for us
The root cause appears to be a problem with the alpine base image's DNS behavior in Kubernetes clusters: ndots breaks DNS resolving #64924. The docker-alpine repository likewise has an open issue mentioning DNS resolution timeouts and failures.
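Aside from downgrading, a workaround often suggested in those issue threads is lowering the Pod's ndots option so external hostnames are resolved as absolute names; a sketch of what that looks like in a Pod spec (it does not necessarily cure the alpine problem described above):

```yaml
# Pod-level DNS override: with ndots:1, names containing a dot are queried
# as-is instead of first walking the cluster search domains (default ndots:5)
dnsConfig:
  options:
    - name: ndots
      value: "1"
```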
Installation is then a single command:

```bash
$ helm install /root/gitlab-runner/ --name k8s-gitlab-runner --namespace gitlab-runner
```
The output is as follows:

```
NAME: k8s-gitlab-runner
LAST DEPLOYED: Tue Nov 26 21:51:57 2019
NAMESPACE: gitlab-runner
STATUS: DEPLOYED
RESOURCES:
==> v1/ConfigMap
NAME DATA AGE
k8s-gitlab-runner-gitlab-runner 5 0s
==> v1/Deployment
NAME READY UP-TO-DATE AVAILABLE AGE
k8s-gitlab-runner-gitlab-runner 0/1 1 0 0s
==> v1/Pod(related)
NAME READY STATUS RESTARTS AGE
k8s-gitlab-runner-gitlab-runner-744d598997-xwh92 0/1 Pending 0 0s
==> v1/Role
NAME AGE
k8s-gitlab-runner-gitlab-runner 0s
==> v1/RoleBinding
NAME AGE
k8s-gitlab-runner-gitlab-runner 0s
==> v1/Secret
NAME TYPE DATA AGE
k8s-gitlab-runner-gitlab-runner Opaque 2 0s
==> v1/ServiceAccount
NAME SECRETS AGE
k8s-gitlab-runner-gitlab-runner 1 0s
NOTES:
Your GitLab Runner should now be registered against the GitLab instance reachable at: "https://gitlab.fjy8018.top/"
```
Checking the gitlab admin page shows that a runner has registered successfully.
If the original ci file was built against the 19.03 DinD image, TLS-related configuration needs to be added:
```yaml
image: docker:19.03

variables:
  DOCKER_DRIVER: overlay
  DOCKER_HOST: tcp://localhost:2375
  DOCKER_TLS_CERTDIR: ""
...
```
其他配置保持不變,使用DinD構建
Because the target is a k8s cluster and deploying to it requires the kubectl client, I hand-built a kubectl docker image, using gitlab to trigger an automated dockerhub build. The build recipe is public and transparent, so it can be used with confidence; if you need other versions, pull requests are welcome and they will be added later. For now only 1.14.0 is used.
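Such an image is essentially just the kubectl binary on a small base. A minimal sketch of the idea (an illustration, not necessarily the exact recipe behind fjy8018/kubectl):

```dockerfile
# Minimal kubectl client image (illustrative sketch).
# No ENTRYPOINT is set: GitLab CI jobs need a shell to run their script lines.
FROM alpine:3.10
ARG KUBECTL_VERSION=v1.14.0
RUN apk add --no-cache curl ca-certificates \
    && curl -fsSL -o /usr/local/bin/kubectl \
       https://storage.googleapis.com/kubernetes-release/release/${KUBECTL_VERSION}/bin/linux/amd64/kubectl \
    && chmod +x /usr/local/bin/kubectl
CMD ["kubectl", "version", "--client"]
```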
Beyond the kubectl client, the TLS connection and the account used to connect still need to be configured.
For safety, create a dedicated ServiceAccount that only accesses the project's namespace:
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: hmdt-gitlab-ci
  namespace: hmdt
```
Using the cluster's RBAC mechanism, grant this account admin permissions within the namespace:
```yaml
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
  name: hmdt-gitlab-role
  namespace: hmdt
subjects:
  - kind: ServiceAccount
    name: hmdt-gitlab-ci
    namespace: hmdt
roleRef:
  apiGroup: rbac.authorization.k8s.io
  # "admin" ships as a built-in ClusterRole; binding it through a RoleBinding
  # restricts it to this namespace only
  kind: ClusterRole
  name: admin
```
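A quick sanity check that the binding works, via impersonation (a sketch; it requires impersonation rights on your own admin account):

```bash
# Should print "yes" if the ServiceAccount has admin rights in hmdt
$ kubectl auth can-i create deployments -n hmdt \
    --as=system:serviceaccount:hmdt:hmdt-gitlab-ci
```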
After it is created, look up the unique secret name generated for it in the k8s cluster, here hmdt-gitlab-ci-token-86n89:

```bash
$ kubectl describe sa hmdt-gitlab-ci -n hmdt
Name:                hmdt-gitlab-ci
Namespace:           hmdt
Labels:              <none>
Annotations:         kubectl.kubernetes.io/last-applied-configuration:
                       {"apiVersion":"v1","kind":"ServiceAccount","metadata":{"annotations":{},"name":"hmdt-gitlab-ci","namespace":"hmdt"}}
Image pull secrets:  <none>
Mountable secrets:   hmdt-gitlab-ci-token-86n89
Tokens:              hmdt-gitlab-ci-token-86n89
Events:              <none>
```
Then extract the CA certificate from the Secret above:

```bash
$ kubectl get secret hmdt-gitlab-ci-token-86n89 -n hmdt -o json | jq -r '.data["ca.crt"]' | base64 -d
```
And the corresponding Token:

```bash
$ kubectl get secret hmdt-gitlab-ci-token-86n89 -n hmdt -o json | jq -r '.data.token' | base64 -d
```
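The GitLab cluster page also asks for the API server URL; one way to read it from the local kubeconfig (assuming kubectl currently points at this cluster):

```bash
# Print the API server endpoint of the current context's cluster
$ kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}'
```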
Go to the gitlab Kubernetes cluster configuration page and fill in this information so that gitlab connects to the cluster automatically.
Note: the GitLab-managed cluster option must be unchecked here, otherwise gitlab will automatically create a new service account instead of using the one just created, and jobs will fail with permission errors at runtime. With the option left checked, the error looks like this: gitlab creates a new service account hmdt-prod-service-account, which has no permission to operate in the target namespace.
Next, create the environment; the name and url can be customized as needed.
The final CI file is as follows; it builds the Dockerfile using the DinD approach:
```yaml
image: docker:19.03

variables:
  MAVEN_CLI_OPTS: "-s .m2/settings.xml --batch-mode -Dmaven.test.skip=true"
  MAVEN_OPTS: "-Dmaven.repo.local=.m2/repository"
  DOCKER_DRIVER: overlay
  DOCKER_HOST: tcp://localhost:2375
  DOCKER_TLS_CERTDIR: ""
  SPRING_PROFILES_ACTIVE: docker
  IMAGE_VERSION: "1.8.6"
  DOCKER_REGISTRY_MIRROR: "https://XXX.mirror.aliyuncs.com"

stages:
  - test
  - package
  - review
  - deploy

maven-build:
  image: maven:3-jdk-8
  stage: test
  retry: 2
  script:
    - mvn $MAVEN_CLI_OPTS clean package -U -B -T 2C
  artifacts:
    expire_in: 1 week
    paths:
      - target/*.jar

maven-scan:
  stage: test
  retry: 2
  image: maven:3-jdk-8
  script:
    - mvn $MAVEN_CLI_OPTS verify sonar:sonar

maven-deploy:
  stage: deploy
  retry: 2
  image: maven:3-jdk-8
  script:
    - mvn $MAVEN_CLI_OPTS deploy

docker-harbor-build:
  image: docker:19.03
  stage: package
  retry: 2
  services:
    - name: docker:19.03-dind
      alias: docker
  before_script:
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" $CI_REGISTRY
  script:
    - docker build --pull -t "$CI_REGISTRY_IMAGE:$IMAGE_VERSION" .
    - docker push "$CI_REGISTRY_IMAGE:$IMAGE_VERSION"
    - docker logout $CI_REGISTRY

deploy_live:
  image: fjy8018/kubectl:v1.14.0
  stage: deploy
  retry: 2
  environment:
    name: prod
    url: https://XXXX
  script:
    - kubectl version
    - kubectl get pods -n hmdt
    - cd manifests/
    - sed -i "s/__IMAGE_VERSION_SLUG__/${IMAGE_VERSION}/" deployment.yaml
    - kubectl apply -f deployment.yaml
    - kubectl rollout status -f deployment.yaml
    - kubectl get pods -n hmdt
```
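The deploy job's sed step assumes manifests/deployment.yaml carries an __IMAGE_VERSION_SLUG__ placeholder in its image tag. The manifest itself is not shown in this post; a minimal hypothetical sketch of the relevant part:

```yaml
# manifests/deployment.yaml (hypothetical sketch; only the placeholder matters)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hmdt-app
  namespace: hmdt
spec:
  replicas: 2
  selector:
    matchLabels:
      app: hmdt-app
  template:
    metadata:
      labels:
        app: hmdt-app
    spec:
      imagePullSecrets:
        - name: registry-secret
      containers:
        - name: hmdt-app
          # the CI job's sed replaces this placeholder with $IMAGE_VERSION
          image: harbor.fjy8018.top/hmdt/hmdt-app:__IMAGE_VERSION_SLUG__
```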
To build the Dockerfile with Kaniko instead, configure the pipeline as follows. Note that the gcr.io/kaniko-project/executor:debug image it depends on lives in Google's container registry and may not be pullable from some networks.
```yaml
image: docker:19.03

variables:
  MAVEN_CLI_OPTS: "-s .m2/settings.xml --batch-mode -Dmaven.test.skip=true"
  MAVEN_OPTS: "-Dmaven.repo.local=.m2/repository"
  DOCKER_DRIVER: overlay
  DOCKER_HOST: tcp://localhost:2375
  DOCKER_TLS_CERTDIR: ""
  SPRING_PROFILES_ACTIVE: docker
  IMAGE_VERSION: "1.8.6"
  DOCKER_REGISTRY_MIRROR: "https://XXX.mirror.aliyuncs.com"

cache:
  paths:
    - target/

stages:
  - test
  - package
  - review
  - deploy

maven-build:
  image: maven:3-jdk-8
  stage: test
  retry: 2
  script:
    - mvn $MAVEN_CLI_OPTS clean package -U -B -T 2C
  artifacts:
    expire_in: 1 week
    paths:
      - target/*.jar

maven-scan:
  stage: test
  retry: 2
  image: maven:3-jdk-8
  script:
    - mvn $MAVEN_CLI_OPTS verify sonar:sonar

maven-deploy:
  stage: deploy
  retry: 2
  image: maven:3-jdk-8
  script:
    - mvn $MAVEN_CLI_OPTS deploy

docker-harbor-build:
  stage: package
  retry: 2
  image:
    name: gcr.io/kaniko-project/executor:debug
    entrypoint: [""]
  script:
    - echo "{\"auths\":{\"$CI_REGISTRY\":{\"username\":\"$CI_REGISTRY_USER\",\"password\":\"$CI_REGISTRY_PASSWORD\"}}}" > /kaniko/.docker/config.json
    - /kaniko/executor --context $CI_PROJECT_DIR --dockerfile $CI_PROJECT_DIR/Dockerfile --destination $CI_REGISTRY_IMAGE:$IMAGE_VERSION

deploy_live:
  image: fjy8018/kubectl:v1.14.0
  stage: deploy
  retry: 2
  environment:
    name: prod
    url: https://XXXX
  script:
    - kubectl version
    - kubectl get pods -n hmdt
    - cd manifests/
    - sed -i "s/__IMAGE_VERSION_SLUG__/${IMAGE_VERSION}/" deployment.yaml
    - kubectl apply -f deployment.yaml
    - kubectl rollout status -f deployment.yaml
    - kubectl get pods -n hmdt
```
The runners in Kubernetes scale up and down automatically with the job load; the ceiling currently configured (concurrent in the Helm values) is 10.
Grafana can also observe the cluster's resource usage during builds.
(Screenshot: DinD Dockerfile build result)

(Screenshot: Kaniko Dockerfile build result)
的結果執行部署時gitlab會自動注入配置好的kubectl
config
Once deployment completes, the result can be viewed on the environment page; only successful deployments are recorded.