Rancher's Kubernetes CRD-Based Implementation

Preface

In 2017 Kubernetes came to dominate the container orchestration space. As Kubernetes and the surrounding CNCF ecosystem continue to grow, they are bound to have a profound impact on cloud computing as a whole. Backed by Kubernetes, the PaaS layer, long undervalued in cloud computing, is finally starting to show its strength. A number of vendors have begun building their own PaaS platforms on top of Kubernetes, with OpenShift and Rancher among the most representative. This article introduces Rancher and walks through parts of its source code to see how a PaaS platform can be developed on top of Kubernetes.

Rancher 1.x was built around its own cattle orchestration engine while also supporting Kubernetes, Mesos and Swarm; in the current 2.0.0-beta only Kubernetes remains, which is very much in line with where the industry is heading. Starting with version 1.7, Kubernetes provides the CustomResourceDefinition (CRD), a user-defined resource type that lets developers manage custom resource objects through an extension mechanism rather than by modifying Kubernetes' own code. Rancher 2.0 relies on exactly this feature to extend Kubernetes and implement its business logic.
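
To make the idea concrete, the following is a minimal sketch of registering a CRD through the Go apiextensions client of that era (the v1beta1 API); the Widget group, kind and names are invented for illustration and are not resources that Rancher actually defines:

import (
    apiextv1beta1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1beta1"
    apiextclient "k8s.io/apiextensions-apiserver/pkg/client/clientset/clientset"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/tools/clientcmd"
)

// createWidgetCRD registers a hypothetical "widgets.example.com" CRD.
// The group, kind and plural names here are purely illustrative.
func createWidgetCRD(kubeconfigPath string) error {
    cfg, err := clientcmd.BuildConfigFromFlags("", kubeconfigPath)
    if err != nil {
        return err
    }
    client, err := apiextclient.NewForConfig(cfg)
    if err != nil {
        return err
    }
    crd := &apiextv1beta1.CustomResourceDefinition{
        // The CRD name must be <plural>.<group>.
        ObjectMeta: metav1.ObjectMeta{Name: "widgets.example.com"},
        Spec: apiextv1beta1.CustomResourceDefinitionSpec{
            Group:   "example.com",
            Version: "v1",
            Scope:   apiextv1beta1.NamespaceScoped,
            Names: apiextv1beta1.CustomResourceDefinitionNames{
                Plural:   "widgets",
                Singular: "widget",
                Kind:     "Widget",
            },
        },
    }
    // Once created, Widget objects can be managed through the API server
    // like any built-in resource, without touching Kubernetes core code.
    _, err = client.ApiextensionsV1beta1().CustomResourceDefinitions().Create(crd)
    return err
}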

Code Analysis

Now for the code. Rancher 2.0 is written in Go, so we start from main.go. It mainly uses "github.com/urfave/cli" to build a CLI application and then calls the run() method (a minimal sketch of this CLI wiring follows the run() listing below):

func run(cfg app.Config) error {
    dump.GoroutineDumpOn(syscall.SIGUSR1, syscall.SIGILL)
    ctx := signal.SigTermCancelContext(context.Background())

    embedded, ctx, kubeConfig, err := k8s.GetConfig(ctx, cfg.K8sMode, cfg.KubeConfig)
    if err != nil {
        return err
    }
    cfg.Embedded = embedded

    os.Unsetenv("KUBECONFIG")
    kubeConfig.Timeout = 30 * time.Second
    return app.Run(ctx, *kubeConfig, &cfg)
}
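
For reference, the urfave/cli wiring that leads to such a run() call typically looks like the following minimal sketch; the flag and the runSketch helper are hypothetical stand-ins, not Rancher's actual main.go:

package main

import (
    "os"

    "github.com/urfave/cli"
)

func main() {
    app := cli.NewApp()
    app.Name = "rancher-server-sketch"
    app.Flags = []cli.Flag{
        // Illustrative flag only; Rancher's real main.go defines many more.
        cli.StringFlag{Name: "kubeconfig", Usage: "path to a kubeconfig file"},
    }
    app.Action = func(c *cli.Context) error {
        // In Rancher the parsed flags are packed into an app.Config value
        // and handed to run(cfg); runSketch stands in for that call here.
        return runSketch(c.String("kubeconfig"))
    }
    if err := app.Run(os.Args); err != nil {
        os.Exit(1)
    }
}

// runSketch is a hypothetical stand-in for Rancher's run(cfg app.Config).
func runSketch(kubeconfig string) error {
    return nil
}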

Inside run(), an embedded Kubernetes cluster is created (the details of that process are outside the scope of this article). Building the cluster first requires a plan, which defines which nodes the cluster runs on, which processes to start, which arguments to use, and so on. Printing the plan from the code yields the following:

nodes:
- address: 127.0.0.1
  processes:
    etcd:
      name: etcd
      command: []
      args:
      - /usr/local/bin/etcd
      - --peer-client-cert-auth
      - --client-cert-auth
      - --data-dir=/var/lib/rancher/etcd
      - --initial-cluster-token=etcd-cluster-1
      - --advertise-client-urls=https://127.0.0.1:2379,https://127.0.0.1:4001
      - --initial-cluster-state=new
      - --peer-trusted-ca-file=/etc/kubernetes/ssl/kube-ca.pem
      - --name=etcd-master
      - --peer-cert-file=/etc/kubernetes/ssl/kube-etcd-127-0-0-1.pem
      - --peer-key-file=/etc/kubernetes/ssl/kube-etcd-127-0-0-1-key.pem
      - --listen-client-urls=https://0.0.0.0:2379
      - --initial-advertise-peer-urls=https://127.0.0.1:2380
      - --trusted-ca-file=/etc/kubernetes/ssl/kube-ca.pem
      - --cert-file=/etc/kubernetes/ssl/kube-etcd-127-0-0-1.pem
      - --key-file=/etc/kubernetes/ssl/kube-etcd-127-0-0-1-key.pem
      - --listen-peer-urls=https://0.0.0.0:2380
      - --initial-cluster=etcd-master=https://127.0.0.1:2380
      env: []
      image: rancher/coreos-etcd:v3.0.17
      imageregistryauthconfig: ""
      volumesfrom: []
      binds:
      - /var/lib/etcd:/var/lib/rancher/etcd:z
      - /etc/kubernetes:/etc/kubernetes:z
      networkmode: host
      restartpolicy: always
      pidmode: ""
      privileged: false
      healthcheck:
        url: https://127.0.0.1:2379/health
    kube-apiserver:
      name: kube-apiserver
      command:
      - /opt/rke/entrypoint.sh
      - kube-apiserver
      - --insecure-port=0
      - --kubelet-client-key=/etc/kubernetes/ssl/kube-apiserver-key.pem
      - --insecure-bind-address=127.0.0.1
      - --bind-address=127.0.0.1
      - --secure-port=6443
      - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
      - --service-account-key-file=/etc/kubernetes/ssl/kube-apiserver-key.pem
      - --cloud-provider=
      - --service-cluster-ip-range=10.43.0.0/16
      - --tls-cert-file=/etc/kubernetes/ssl/kube-apiserver.pem
      - --tls-private-key-file=/etc/kubernetes/ssl/kube-apiserver-key.pem
      - --kubelet-client-certificate=/etc/kubernetes/ssl/kube-apiserver.pem
      - --authorization-mode=Node,RBAC
      - --allow-privileged=true
      - --admission-control=ServiceAccount,NamespaceLifecycle,LimitRanger,PersistentVolumeLabel,DefaultStorageClass,ResourceQuota,DefaultTolerationSeconds
      - --storage-backend=etcd3
      - --client-ca-file=/etc/kubernetes/ssl/kube-ca.pem
      - --advertise-address=10.43.0.1
      args:
      - --etcd-cafile=/etc/kubernetes/ssl/kube-ca.pem
      - --etcd-certfile=/etc/kubernetes/ssl/kube-node.pem
      - --etcd-keyfile=/etc/kubernetes/ssl/kube-node-key.pem
      - --etcd-servers=https://127.0.0.1:2379
      - --etcd-prefix=/registry
      env: []
      image: rancher/server:dev
      imageregistryauthconfig: ""
      volumesfrom:
      - service-sidekick
      binds:
      - /etc/kubernetes:/etc/kubernetes:z
      networkmode: host
      restartpolicy: always
      pidmode: ""
      privileged: false
      healthcheck:
        url: https://localhost:6443/healthz
    kube-controller-manager:
      name: kube-controller-manager
      command:
      - /opt/rke/entrypoint.sh
      - kube-controller-manager
      - --allow-untagged-cloud=true
      - --v=2
      - --allocate-node-cidrs=true
      - --kubeconfig=/etc/kubernetes/ssl/kubecfg-kube-controller-manager.yaml
      - --service-account-private-key-file=/etc/kubernetes/ssl/kube-apiserver-key.pem
      - --address=0.0.0.0
      - --leader-elect=true
      - --cloud-provider=
      - --node-monitor-grace-period=40s
      - --pod-eviction-timeout=5m0s
      - --service-cluster-ip-range=10.43.0.0/16
      - --root-ca-file=/etc/kubernetes/ssl/kube-ca.pem
      - --configure-cloud-routes=false
      - --enable-hostpath-provisioner=false
      - --cluster-cidr=10.42.0.0/16
      args:
      - --use-service-account-credentials=true
      env: []
      image: rancher/server:dev
      imageregistryauthconfig: ""
      volumesfrom:
      - service-sidekick
      binds:
      - /etc/kubernetes:/etc/kubernetes:z
      networkmode: host
      restartpolicy: always
      pidmode: ""
      privileged: false
      healthcheck:
        url: http://localhost:10252/healthz
    kube-proxy:
      name: kube-proxy
      command:
      - /opt/rke/entrypoint.sh
      - kube-proxy
      - --kubeconfig=/etc/kubernetes/ssl/kubecfg-kube-proxy.yaml
      - --v=2
      - --healthz-bind-address=0.0.0.0
      args: []
      env: []
      image: rancher/server:dev
      imageregistryauthconfig: ""
      volumesfrom:
      - service-sidekick
      binds:
      - /etc/kubernetes:/etc/kubernetes:z
      networkmode: host
      restartpolicy: always
      pidmode: host
      privileged: true
      healthcheck:
        url: http://localhost:10256/healthz
    kube-scheduler:
      name: kube-scheduler
      command:
      - /opt/rke/entrypoint.sh
      - kube-scheduler
      - --kubeconfig=/etc/kubernetes/ssl/kubecfg-kube-scheduler.yaml
      - --leader-elect=true
      - --v=2
      - --address=0.0.0.0
      args: []
      env: []
      image: rancher/server:dev
      imageregistryauthconfig: ""
      volumesfrom:
      - service-sidekick
      binds:
      - /etc/kubernetes:/etc/kubernetes:z
      networkmode: host
      restartpolicy: always
      pidmode: ""
      privileged: false
      healthcheck:
        url: http://localhost:10251/healthz
    kubelet:
      name: kubelet
      command:
      - /opt/rke/entrypoint.sh
      - kubelet
      - --address=0.0.0.0
      - --cadvisor-port=0
      - --enforce-node-allocatable=
      - --network-plugin=cni
      - --cluster-dns=10.43.0.10
      - --kubeconfig=/etc/kubernetes/ssl/kubecfg-kube-node.yaml
      - --v=2
      - --cni-conf-dir=/etc/cni/net.d
      - --resolv-conf=/etc/resolv.conf
      - --volume-plugin-dir=/var/lib/kubelet/volumeplugins
      - --read-only-port=0
      - --cni-bin-dir=/opt/cni/bin
      - --allow-privileged=true
      - --pod-infra-container-image=rancher/pause-amd64:3.0
      - --client-ca-file=/etc/kubernetes/ssl/kube-ca.pem
      - --fail-swap-on=false
      - --cgroups-per-qos=True
      - --anonymous-auth=false
      - --cluster-domain=cluster.local
      - --hostname-override=master
      - --cloud-provider=
      args: []
      env: []
      image: rancher/server:dev
      imageregistryauthconfig: ""
      volumesfrom:
      - service-sidekick
      binds:
      - /etc/kubernetes:/etc/kubernetes:z
      - /etc/cni:/etc/cni:ro,z
      - /opt/cni:/opt/cni:ro,z
      - /var/lib/cni:/var/lib/cni:z
      - /etc/resolv.conf:/etc/resolv.conf
      - /sys:/sys:rprivate
      - /var/lib/docker:/var/lib/docker:rw,rprivate,z
      - /var/lib/kubelet:/var/lib/kubelet:shared,z
      - /var/run:/var/run:rw,rprivate
      - /run:/run:rprivate
      - /etc/ceph:/etc/ceph
      - /dev:/host/dev:rprivate
      - /var/log/containers:/var/log/containers:z
      - /var/log/pods:/var/log/pods:z
      networkmode: host
      restartpolicy: always
      pidmode: host
      privileged: true
      healthcheck:
        url: https://localhost:10250/healthz
    service-sidekick:
      name: service-sidekick
      command: []
      args: []
      env: []
      image: rancher/rke-service-sidekick:v0.1.2
      imageregistryauthconfig: ""
      volumesfrom: []
      binds: []
      networkmode: none
      restartpolicy: ""
      pidmode: ""
      privileged: false
      healthcheck:
        url: ""
  portchecks:
  - address: 127.0.0.1
    port: 10250
    protocol: TCP
  - address: 127.0.0.1
    port: 6443
    protocol: TCP
  - address: 127.0.0.1
    port: 2379
    protocol: TCP
  - address: 127.0.0.1
    port: 2380
    protocol: TCP
  files:
  - name: /etc/kubernetes/cloud-config.json
    contents: ""
  annotations:
    rke.io/external-ip: 127.0.0.1
    rke.io/internal-ip: 127.0.0.1
  labels:
    node-role.kubernetes.io/controlplane: "true"
    node-role.kubernetes.io/etcd: "true"

Once this embedded Kubernetes cluster is up, app.Run() is called, which is where the Rancher server actually starts; here we only look at how the CRD resources are created and used. First, let's see which custom resources are defined. Accessing the cluster requires a kubeconfig file, which is generated in the Rancher source tree at github.com/rancher/rancher/kube_config_cluster.yml. With that file, kubectl can query the cluster:

[root@localhost rancher]# kubectl get crd
NAME                                                            AGE
apps.project.cattle.io                                          34d
authconfigs.management.cattle.io                                34d
catalogs.management.cattle.io                                   34d
clusteralerts.management.cattle.io                              34d
clustercomposeconfigs.management.cattle.io                      34d
clusterevents.management.cattle.io                              34d
clusterloggings.management.cattle.io                            34d
clusterpipelines.management.cattle.io                           34d
clusterregistrationtokens.management.cattle.io                  34d
clusterroletemplatebindings.management.cattle.io                34d
clusters.management.cattle.io                                   34d
dynamicschemas.management.cattle.io                             34d
globalcomposeconfigs.management.cattle.io                       34d
globalrolebindings.management.cattle.io                         34d
globalroles.management.cattle.io                                34d
groupmembers.management.cattle.io                               34d
groups.management.cattle.io                                     34d
listenconfigs.management.cattle.io                              34d
namespacecomposeconfigs.project.cattle.io                       34d
nodedrivers.management.cattle.io                                34d
nodepools.management.cattle.io                                  34d
nodes.management.cattle.io                                      34d
nodetemplates.management.cattle.io                              34d
notifiers.management.cattle.io                                  34d
pipelineexecutionlogs.management.cattle.io                      34d
pipelineexecutions.management.cattle.io                         34d
pipelines.management.cattle.io                                  34d
podsecuritypolicytemplateprojectbindings.management.cattle.io   15d
podsecuritypolicytemplates.management.cattle.io                 34d
preferences.management.cattle.io                                34d
projectalerts.management.cattle.io                              34d
projectloggings.management.cattle.io                            34d
projectnetworkpolicies.management.cattle.io                     34d
projectroletemplatebindings.management.cattle.io                34d
projects.management.cattle.io                                   34d
roletemplates.management.cattle.io                              34d
settings.management.cattle.io                                   34d
sourcecodecredentials.management.cattle.io                      34d
sourcecoderepositories.management.cattle.io                     34d
templates.management.cattle.io                                  34d
templateversions.management.cattle.io                           34d
tokens.management.cattle.io                                     34d
users.management.cattle.io                                      34d

As you can see, a large number of CRDs are defined; together with custom controllers they implement the corresponding business logic. Before diving in, let's review the programming paradigm for custom Kubernetes controllers, illustrated in the figure below:

[Figure: the client-go custom controller pattern — informer, callbacks, workqueue and worker]

Writing a custom controller means using the client-go library. The blue parts of the figure are provided by client-go and can be used as-is without additional development; the red parts are the business logic the user must supply. The informer tracks changes to the CRD resources; whenever a change occurs it invokes the Callbacks and places the key of the affected object on the Workqueue, and the Worker gets items from the Workqueue and performs the corresponding business processing.
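
Before looking at Rancher's wrapper around this machinery, here is a minimal, self-contained sketch of the paradigm built from plain client-go primitives, assuming the pre-context client-go API of that era; it watches Pods purely for brevity — a CRD-based controller would plug its generated or dynamic client into the ListWatch instead:

import (
    "fmt"
    "time"

    corev1 "k8s.io/api/core/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/runtime"
    "k8s.io/apimachinery/pkg/watch"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/tools/cache"
    "k8s.io/client-go/util/workqueue"
)

// runSimpleController wires an informer to a workqueue and runs one worker,
// mirroring the informer -> callbacks -> workqueue -> worker flow described above.
func runSimpleController(client kubernetes.Interface, stopCh <-chan struct{}) {
    // The ListWatch tells the informer how to list and watch the resource.
    lw := &cache.ListWatch{
        ListFunc: func(options metav1.ListOptions) (runtime.Object, error) {
            return client.CoreV1().Pods(metav1.NamespaceAll).List(options)
        },
        WatchFunc: func(options metav1.ListOptions) (watch.Interface, error) {
            return client.CoreV1().Pods(metav1.NamespaceAll).Watch(options)
        },
    }

    queue := workqueue.NewRateLimitingQueue(workqueue.DefaultControllerRateLimiter())
    enqueue := func(obj interface{}) {
        if key, err := cache.DeletionHandlingMetaNamespaceKeyFunc(obj); err == nil {
            queue.Add(key)
        }
    }

    // The event handlers are the "Callbacks": every change pushes a key onto the queue.
    indexer, informer := cache.NewIndexerInformer(lw, &corev1.Pod{}, 30*time.Second,
        cache.ResourceEventHandlerFuncs{
            AddFunc:    enqueue,
            UpdateFunc: func(_, obj interface{}) { enqueue(obj) },
            DeleteFunc: enqueue,
        }, cache.Indexers{})

    go informer.Run(stopCh)
    if !cache.WaitForCacheSync(stopCh, informer.HasSynced) {
        return
    }

    // Worker loop: pop keys and run the business logic against the cached object.
    for {
        key, quit := queue.Get()
        if quit {
            return
        }
        obj, exists, err := indexer.GetByKey(key.(string))
        if err == nil {
            fmt.Printf("reconciling %v (exists=%v, cached=%v)\n", key, exists, obj != nil)
        }
        queue.Done(key)
    }
}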

Now let's see how the controllers for these CRD resources are created in Rancher. The Run() method invoked from main.go first builds a scaledContext, and scaledContext.Start() then creates and starts the controllers:

func Run(ctx context.Context, kubeConfig rest.Config, cfg *Config) error {
    if err := service.Start(); err != nil {
        return err
    }

    scaledContext, clusterManager, err := buildScaledContext(ctx, kubeConfig, cfg)
    if err != nil {
        return err
    }

    if err := server.Start(ctx, cfg.HTTPListenPort, cfg.HTTPSListenPort, scaledContext, clusterManager); err != nil {
        return err
    }

    if err := scaledContext.Start(ctx); err != nil {
        return err
    }

......

The scaledContext.Start() method is:

func (c *ScaledContext) Start(ctx context.Context) error {
    logrus.Info("Starting API controllers")
    return controller.SyncThenStart(ctx, 5, c.controllers()...)
}

Here c.controllers() performs an interface assignment: the Management, Project, RBAC and Core interfaces of the ScaledContext struct are assigned to the controller package's Starter interface:

type ScaledContext struct {
    ClientGetter      proxy.ClientGetter
    LocalConfig       *rest.Config
    RESTConfig        rest.Config
    UnversionedClient rest.Interface
    K8sClient         kubernetes.Interface
    APIExtClient      clientset.Interface
    Schemas           *types.Schemas
    AccessControl     types.AccessControl
    Dialer            dialer.Factory
    UserManager       user.Manager
    Leader            bool

    Management managementv3.Interface
    Project    projectv3.Interface
    RBAC       rbacv1.Interface
    Core       corev1.Interface
}

func (c *ScaledContext) controllers() []controller.Starter {
    return []controller.Starter{
        c.Management,
        c.Project,
        c.RBAC,
        c.Core,
    }
}

The custom controllers are created and started through the Sync and Start methods of the controller package's Starter interface:

type Starter interface {
    Sync(ctx context.Context) error
    Start(ctx context.Context, threadiness int) error
}

func SyncThenStart(ctx context.Context, threadiness int, starters ...Starter) error {
    if err := Sync(ctx, starters...); err != nil {
        return err
    }
    return Start(ctx, threadiness, starters...)
}

func Sync(ctx context.Context, starters ...Starter) error {
    eg, _ := errgroup.WithContext(ctx)
    for _, starter := range starters {
        func(starter Starter) {
            eg.Go(func() error {
                return starter.Sync(ctx)
            })
        }(starter)
    }
    return eg.Wait()
}

func Start(ctx context.Context, threadiness int, starters ...Starter) error {
    for _, starter := range starters {
        if err := starter.Start(ctx, threadiness); err != nil {
            return err
        }
    }
    return nil
}

The buildScaledContext() method seen earlier assigns the implementations of the Management, Project and other interfaces, and each of those in turn contains the controllers it needs; the details are not expanded here. Take the cluster controller under Management as an example: its creation proceeds as follows.

Ultimately the genericController struct in the controller package's generic_controller.go implements the Sync and Start methods, and NewGenericController sets up the informer and workqueue described earlier:

func NewGenericController(name string, genericClient Backend) GenericController {
    informer := cache.NewSharedIndexInformer(
        &cache.ListWatch{
            ListFunc:  genericClient.List,
            WatchFunc: genericClient.Watch,
        },
        genericClient.ObjectFactory().Object(), resyncPeriod, cache.Indexers{cache.NamespaceIndex: cache.MetaNamespaceIndexFunc})

    rl := workqueue.NewMaxOfRateLimiter(
        workqueue.NewItemExponentialFailureRateLimiter(500*time.Millisecond, 1000*time.Second),
        // 10 qps, 100 bucket size.  This is only for retry speed and its only the overall factor (not per item)
        &workqueue.BucketRateLimiter{Bucket: ratelimit.NewBucketWithRate(float64(10), int64(100))},
    )

    return &genericController{
        informer: informer,
        queue:    workqueue.NewNamedRateLimitingQueue(rl, name),
        name:     name,
    }
}

The Sync method registers the informer's Callbacks:

func (g *genericController) Sync(ctx context.Context) error {
    g.Lock()
    defer g.Unlock()

    return g.sync(ctx)
}

func (g *genericController) sync(ctx context.Context) error {
    if g.synced {
        return nil
    }

    defer utilruntime.HandleCrash()

    g.informer.AddEventHandler(cache.ResourceEventHandlerFuncs{
        AddFunc:    g.queueObject,
        UpdateFunc: func(_, obj interface{}) { g.queueObject(obj) },
        DeleteFunc: g.queueObject,
    })

    logrus.Infof("Syncing %s Controller", g.name)

    go g.informer.Run(ctx.Done())

    if !cache.WaitForCacheSync(ctx.Done(), g.informer.HasSynced) {
        return fmt.Errorf("failed to sync controller %s", g.name)
    }
    logrus.Infof("Syncing %s Controller Done", g.name)

    g.synced = true
    return nil
}

The Start method contains the worker part, i.e. it invokes the processing logic the controller needs; an excerpt is shown below:

......
func (g *genericController) processNextWorkItem() bool {
    key, quit := g.queue.Get()
    if quit {
        return false
    }
    defer g.queue.Done(key)

    // do your work on the key.  This method will contains your "do stuff" logic
    err := g.syncHandler(key.(string))
    checkErr := err
    if handlerErr, ok := checkErr.(*handlerError); ok {
        checkErr = handlerErr.err
    }
    if _, ok := checkErr.(*ForgetError); err == nil || ok {
        if ok {
            logrus.Infof("%v %v completed with dropped err: %v", g.name, key, err)
        }
        g.queue.Forget(key)
        return true
    }

    if err := filterConflictsError(err); err != nil {
        utilruntime.HandleError(fmt.Errorf("%v %v %v", g.name, key, err))
    }

    g.queue.AddRateLimited(key)

    return true
}
......

Summary

Kubernetes has become the de facto standard for container orchestration, but by itself it is not enough to constitute a PaaS platform; it needs to be extended in many directions. Rancher is a good example of such an extension: it offers unified management of multiple clusters and projects, CI/CD, a Helm-based application catalog, plus management of permissions, monitoring and logging. This article took a brief look at the code to learn how Rancher extends Kubernetes, and explained the implementation mechanism in terms of the Kubernetes controller programming paradigm.
