A New Way to Use Kubernetes: Programming in YAML

Date: 2020-09-29

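Summary: How do you run a performance test? Either you do it in code, writing a pile of scripts that are thrown away after use, or you rely on a platform and work within the flow the platform defines. With the latter, the complexity of the target scenarios (deploying a specific workload, observing specific performance metrics, network access issues, and so on) often means the platform can only keep up with constantly changing development needs at high cost. Can this problem be solved better in a cloud-native setting?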

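Author | 悟鵬

Introduction

Performance testing is a routine need in day-to-day development work; it is used to find out how well a service actually performs.

So how do you run a performance test? Either you do it in code, writing a pile of scripts that are thrown away after use, or you rely on a platform and work within the flow the platform defines. With the latter, the complexity of the target scenarios (deploying a specific workload, observing specific performance metrics, network access issues, and so on) often means the platform can only keep up with constantly changing development needs at high cost.

In a cloud-native setting, can this problem be solved better?

Let's start with two YAML files:

performance-test.yaml describes the operation flow in K8s:
1. Create a Namespace for the test
2. Start monitoring Deployment creation efficiency and creation success rate
3. Repeat the following N times: ① create a Deployment from the workload template; ② wait for the Deployment to become Ready
4. Delete the test Namespace
basic-1-pod-deployment.yaml describes the workload template to use

performance-test.yaml: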

apiVersion: aliyun.com/v1alpha1
kind: Beidou
metadata:
  name: performance
  namespace: beidou
spec:
  steps:
  - name: "Create Namespace If Not Exists"
    operations:
    - name: "create namespace"
      type: Task
      op: CreateNamespace
      args:
      - name: NS
        value: beidou
  - name: "Monitor Deployment Creation Efficiency"
    operations:
    - name: "Begin To Monitor Deployment Creation Efficiency"
      type: Task
      op: DeploymentCreationEfficiency
      args:
      - name: NS
        value: beidou
    - name: "Repeat 1 Times"
      type: Task
      op: RepeatNTimes
      args:
      - name: TIMES
        value: "1"
      - name: ACTION
        reference:
          id: deployment-operation
  - name: "Delete namespace"
    operations:
    - name: "delete namespace"
      type: Task
      op: DeleteNamespace
      args:
      - name: NS
        value: beidou
      - name: FORCE
        value: "false"
  references:
  - id: deployment-operation
    steps:
    - name: "Prepare Deployment"
      operations:
      - name: "Prepare Deployment"
        type: Task
        op: PrepareBatchDeployments
        args:
        - name: NS
          value: beidou
        - name: NODE_TYPE
          value: ebm
        - name: BATCH_NUM
          value: "1"
        - name: TEMPLATE
          value: "./templates/basic-1-pod-deployment.yaml"
        - name: DEPLOYMENT_REPLICAS
          value: "1"
        - name: DEPLOYMENT_PREFIX
          value: "ebm"
      - name: "Wait For Deployments To Be Ready"
        type: Task
        op: WaitForBatchDeploymentsReady
        args:
        - name: NS
          value: beidou
        - name: TIMEOUT
          value: "3m"
        - name: CHECK_INTERVAL
          value: "2s"

basic-1-pod-deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: basic-1-pod
spec:
  selector:
    matchLabels:
      app: basic-1-pod
  template:
    metadata:
      labels:
        app: basic-1-pod
    spec:
      containers:
      - name: nginx
        image: registry-vpc.cn-hangzhou.aliyuncs.com/xxx/nginx:1.17.9
        imagePullPolicy: Always
        resources:
          limits:
            cpu: 2
            memory: 4Gi

Then run performance-test.yaml with a command-line tool:

$ beidou server -c ~/.kube/config services/performance-test.yaml

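The output reports the creation time of each Deployment, the TP95 of the creation time across all Deployments, and whether each Deployment was created successfully.

These metrics are emitted in the Prometheus format, so a Prometheus server can scrape them, and Grafana can then be used to visualize the performance test data.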

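By expressing the idea in YAML and orchestrating both the operations on K8s resources and their monitoring, implementing a performance test is no longer a headache :D

Why program in YAML?

Performance tests, regression tests, and the like do a lot for service quality and need to be done, but the conventional way of implementing them takes considerable time and effort up front, and maintenance costs are high once changes are added.

Typically, atomic operations such as creating a Deployment or checking a Pod's configuration are implemented in code and then combined to meet a requirement, for example: create Deployment -> wait for Deployment ready -> check Pod configuration.

Is there a way to keep the implementation cost as low as possible while also reusing existing experience?

Atomic operations can be wrapped as primitives, such as CreateDeployment and CheckPod, and the flow can be expressed through the structure of YAML. Ideas can then be described in YAML rather than in code, and YAML files that others have already written can be reused to handle a given class of scenarios.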

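That is what programming in YAML means: cut down on repetitive coding, describe logic declaratively, and use YAML files as the unit of scenario-level reuse.

The industry has many kinds of declarative-operation services, such as Ansible and SaltStack in the operations field, and Argo Workflow and clusterloader2 in Kubernetes. Their overall ideas are similar: wrap frequently used operations as primitives and let users express operation logic through those primitives.

With a declarative approach, operations against K8s are abstracted into keywords in YAML, and YAML provides control logic such as serial and parallel execution, so a YAML file can fully describe the work to be done.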

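The idea is similar to Argo Workflow, but the granularity is finer than Argo's: the focus is on operation functions.

The design and implementation of the service are outlined below.

Design and implementation

Service form

Users describe operation logic declaratively in YAML;
The service is delivered as an all-in-one binary tool or as an Operator;
Implementations of common primitives are built in and exposed as keywords in YAML;
Native K8s resources can be configured.

Design

The core of the approach is the design of configuration management: the operation flow is turned into configuration, with the following concepts from top to bottom:

Service: an orchestration of Modules or Tasks;
Module: a task scenario, that is, a collection of operation units (it contains a templates/ directory holding the template files that can be used to configure native K8s resources);
Task: an operation unit that performs an operation using a plugin and its parameters;
Plugin: an operation instruction, similar to a function in a programming language.

The common operations in the target scenarios are abstracted out. These are the primitives available in YAML, and they correspond to the Plugin concept above:

K8s related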
- CreateNamespace
- DeleteNamespace
- PrepareSecret
- PrepareConfigMap
- PrepareBatchDeployments
- WaitForBatchDeploymentsReady
- etc.
Observability related
- DeploymentCreationEfficiency
- PodCreationEfficiency
- etc.
Check related
- CheckPodAnnotations
- CheckPodObjectInfo
- CheckPodInnerStates
- etc.
Control statements
- RepeatNTimes
- etc.

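The four concepts above can be composed in more than one form. The YAML file at the beginning of the article is an example of form 2.

Core implementation

CRD design: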

package v1alpha1

import (
    corev1 "k8s.io/api/core/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// BeidouType is the type related to Beidou execution.
type BeidouType string

const (
    // BeidouTask represents the Task execution type.
    BeidouTask BeidouType = "Task"
)

// +genclient
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object

// Beidou represents a CRD used to describe services.
type Beidou struct {
    metav1.TypeMeta   `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty" protobuf:"bytes,1,opt,name=metadata"`

    Spec   BeidouSpec   `json:"spec,omitempty" protobuf:"bytes,2,opt,name=spec"`
    Status BeidouStatus `json:"status,omitempty" protobuf:"bytes,3,opt,name=status"`
}

// BeidouSpec is the spec of a Beidou.
type BeidouSpec struct {
    Steps      []BeidouStep      `json:"steps" protobuf:"bytes,1,opt,name=steps"`
    References []BeidouReference `json:"references" protobuf:"bytes,2,opt,name=references"`
}

// BeidouStep is the spec of step.
type BeidouStep struct {
    Name       string            `json:"name" protobuf:"bytes,1,opt,name=name"`
    Operations []BeidouOperation `json:"operations" protobuf:"bytes,2,opt,name=operations"`
}

// BeidouOperation is the spec of operation.
type BeidouOperation struct {
    Name string      `json:"name" protobuf:"bytes,1,opt,name=name"`
    Type BeidouType  `json:"type" protobuf:"bytes,2,opt,name=type"`
    Op   string      `json:"op" protobuf:"bytes,3,opt,name=op"`
    Args []BeidouArg `json:"args" protobuf:"bytes,4,opt,name=args"`
}

// BeidouArg is the spec of arg.
type BeidouArg struct {
    Name        string                   `json:"name" protobuf:"bytes,1,opt,name=name"`
    Value       string                   `json:"value,omitempty" protobuf:"bytes,2,opt,name=value"`
    Reference   BeidouOperationReference `json:"reference,omitempty" protobuf:"bytes,3,opt,name=reference"`
    Tolerations []corev1.Toleration      `json:"tolerations,omitempty" protobuf:"bytes,4,opt,name=tolerations"`
    Checking    []string                 `json:"checking,omitempty" protobuf:"bytes,5,opt,name=checking"`
}

// BeidouOperationReference is the spec of operation reference.
type BeidouOperationReference struct {
    ID string `json:"id" protobuf:"bytes,1,opt,name=id"`
}

// BeidouReference is the spec of reference.
type BeidouReference struct {
    ID    string       `json:"id" protobuf:"bytes,1,opt,name=id"`
    Steps []BeidouStep `json:"steps" protobuf:"bytes,2,opt,name=steps"`
}

// BeidouStatus represents the current state of a Beidou.
type BeidouStatus struct {
    Message string `json:"message" protobuf:"bytes,1,opt,name=message"`
}

// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object

// BeidouList is a collection of Beidou.
type BeidouList struct {
    metav1.TypeMeta `json:",inline"`
    metav1.ListMeta `json:"metadata" protobuf:"bytes,1,opt,name=metadata"`

    Items []Beidou `json:"items" protobuf:"bytes,2,opt,name=items"`
}
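
To see how the json tags line up with the fields in performance-test.yaml, the following sketch decodes the JSON form of part of one step using only the standard library. The types are trimmed-down stand-ins for the CRD types above, not the real ones; the pointer on Reference is a choice made here so that omitempty can actually omit it, and the JSON document is a hand-written equivalent of the YAML.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Ref, Arg, Operation, Step and Spec are simplified stand-ins for the
// article's BeidouOperationReference, BeidouArg, BeidouOperation,
// BeidouStep and BeidouSpec.
type Ref struct {
	ID string `json:"id"`
}

type Arg struct {
	Name      string `json:"name"`
	Value     string `json:"value,omitempty"`
	Reference *Ref   `json:"reference,omitempty"` // pointer so omitempty can drop it
}

type Operation struct {
	Name string `json:"name"`
	Type string `json:"type"`
	Op   string `json:"op"`
	Args []Arg  `json:"args"`
}

type Step struct {
	Name       string      `json:"name"`
	Operations []Operation `json:"operations"`
}

type Spec struct {
	Steps []Step `json:"steps"`
}

// specJSON is the JSON form of part of one step from performance-test.yaml.
const specJSON = `{
  "steps": [{
    "name": "Monitor Deployment Creation Efficiency",
    "operations": [{
      "name": "Repeat 1 Times",
      "type": "Task",
      "op": "RepeatNTimes",
      "args": [
        {"name": "TIMES", "value": "1"},
        {"name": "ACTION", "reference": {"id": "deployment-operation"}}
      ]
    }]
  }]
}`

// parseSpec decodes a spec document into the stand-in types.
func parseSpec(data []byte) (Spec, error) {
	var spec Spec
	err := json.Unmarshal(data, &spec)
	return spec, err
}

func main() {
	spec, err := parseSpec([]byte(specJSON))
	if err != nil {
		panic(err)
	}
	for _, step := range spec.Steps {
		for _, op := range step.Operations {
			fmt.Printf("%s -> %s (%d args)\n", step.Name, op.Op, len(op.Args))
		}
	}
}
```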

Core flow:

// ExecSteps executes steps.
func ExecSteps(ctx context.Context, steps []v1alpha1.BeidouStep, references []v1alpha1.BeidouReference) error {
    logger, _ := ctx.Value(CtxLogger).(*log.Entry)

    var hasMonitored bool
    for _, step := range steps {
        for _, op := range step.Operations {
            switch op.Op {
            case "DeploymentCreationEfficiency":
                if !hasMonitored {
                    defer func() {
                        err := monitor.Output()
                        if err != nil {
                            logger.Errorf("Failed to output: %s", err)
                        }
                    }()
                }
                hasMonitored = true
            }

            err := ExecOperation(ctx, op, references)
            if err != nil {
                return fmt.Errorf("failed to run operation %s: %s", op.Name, err)
            }
        }
    }

    return nil
}

// ExecOperation executes operation.
func ExecOperation(ctx context.Context, op v1alpha1.BeidouOperation, references []v1alpha1.BeidouReference) error {
    switch op.Type {
    case v1alpha1.BeidouTask:
        if !tasks.IsRegistered(op.Op) {
            return ErrNotRegistered
        }

        if !tasks.DoesSupportReference(op.Op) {
            return ExecTask(ctx, op.Op, op.Args)
        }

        return ExecTaskWithRefer(ctx, op.Op, op.Args, references)
    }

    return nil
}

// ExecTask executes a task.
func ExecTask(ctx context.Context, opname string, args []v1alpha1.BeidouArg) error {
    switch opname {
    case tasks.CreateNamespace:
        var ns string
        for _, arg := range args {
            switch arg.Name {
            case "NS":
                ns = arg.Value
            }
        }

        return op.CreateNamespace(ctx, ns)
    // ...
    }
    // ...
}

// ExecTaskWithRefer executes a task with reference.
func ExecTaskWithRefer(ctx context.Context, opname string, args []v1alpha1.BeidouArg, references []v1alpha1.BeidouReference) error {
    switch opname {
    case tasks.RepeatNTimes:
        var times int
        var steps []v1alpha1.BeidouStep
        var err error
        for _, arg := range args {
            switch arg.Name {
            case "TIMES":
                times, err = strconv.Atoi(arg.Value)
                if err != nil {
                    return ErrParseArgs
                }
            case "ACTION":
                for _, refer := range references {
                    if refer.ID == arg.Reference.ID {
                        steps = refer.Steps
                        break
                    }
                }
            }
        }

        return RepeatNTimes(ctx, times, steps)
    }

    return ErrNotImplemented
}

Example implementation of an operation primitive:

// PodAnnotations is an operation used to check whether annotations of Pod are expected.
func PodAnnotations(ctx context.Context, data PodAnnotationsData) error {
    kclient, ok := ctx.Value(tasks.KubernetesClient).(kubernetes.Interface)
    if !ok {
        return tasks.ErrNoKubernetesClient
    }

    // client-go v0.18 and later take a context as the first argument to List.
    pods, err := kclient.CoreV1().Pods(data.Namespace).List(ctx, metav1.ListOptions{})
    if err != nil {
        return fmt.Errorf("failed to list pods in ns %s: %s", data.Namespace, err)
    }

    for _, pod := range pods.Items {
        if pod.Annotations == nil {
            return fmt.Errorf("pod %s in ns %s has no annotations", pod.Name, data.Namespace)
        }

        for _, annotation := range data.Exists {
            if _, exists := pod.Annotations[annotation]; !exists {
                return fmt.Errorf("annotation %s does not exist in pod %s in ns %s", annotation, pod.Name, data.Namespace)
            }
        }

        for k, v := range data.Equal {
            if pod.Annotations[k] != v {
                return fmt.Errorf("value of annotation %s is not %s in pod %s in ns %s", k, v, pod.Name, data.Namespace)
            }
        }
    }

    return nil
}
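
The checking logic in PodAnnotations does not depend on the Kubernetes client, so it can be tried out on a plain map. The helper below is a hypothetical extraction of the two inner loops (checkAnnotations is not a name from the article), which makes the Exists and Equal semantics easy to experiment with without a cluster.

```go
package main

import "fmt"

// checkAnnotations verifies that every key in exists is present and that
// every key in equal has the expected value. Reading from a nil map is safe
// in Go, so a pod without annotations simply fails the first lookup.
func checkAnnotations(annotations map[string]string, exists []string, equal map[string]string) error {
	for _, key := range exists {
		if _, ok := annotations[key]; !ok {
			return fmt.Errorf("annotation %s does not exist", key)
		}
	}
	for k, v := range equal {
		if annotations[k] != v {
			return fmt.Errorf("value of annotation %s is not %s", k, v)
		}
	}
	return nil
}

func main() {
	pod := map[string]string{"owner": "beidou", "tier": "test"}
	fmt.Println(checkAnnotations(pod, []string{"owner"}, map[string]string{"tier": "test"}))
	fmt.Println(checkAnnotations(pod, []string{"missing"}, nil))
}
```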

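Follow-up

The Alibaba Cloud Container Service team has implemented a first version internally. It is already used for internal performance testing and routine regression testing of some cloud products, and it has greatly improved our productivity.

Programming in YAML is an expression of declarative operations in cloud-native scenarios, and also a practice of the declarative-service idea. For everyday work scenarios that involve repetitive coding or repetitive…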