kubernetes_tb寶滿

Date: 2019-11-12
Tags: kubernetes
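master: kubectl talks to the API server to create, delete, update and query resources; the scheduler places pods; the controller manager keeps application replicas at the desired count; etcd stores application state.

node: pod, kube-proxy

Master: the cluster control node, responsible for managing and controlling the whole cluster.
API Server: exposes the API and is the entry point for creating, deleting, updating and querying resources.
Controller Manager: the automated control center for all resource objects.
Scheduler: responsible for resource scheduling, i.e. assigning pods to nodes.
Etcd: stores the persistent state of the master.

Node: a worker node that carries out the work assigned to it by the master.
Kubelet: creates, starts and stops pod containers, and handles node management tasks.
kube-proxy: the component that implements Service communication and load balancing.
Docker: the Docker engine, which creates and manages containers on the local machine.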

 

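Pod:

The smallest unit for deploying and running an application or service in a K8s cluster. A Pod can hold multiple containers. The design idea is that the containers in one Pod share a network address and a file system, so they can cooperate through inter-process communication and file sharing, which is simple and efficient.

RC:

The Replication Controller is the earliest API object in K8s for keeping Pods highly available. It monitors running Pods to make sure the specified number of Pod replicas is running in the cluster. The number can be one or many. If fewer replicas than specified are running, the RC starts new ones; if more are running, it kills the surplus. Even when the number is 1, running a Pod through an RC is wiser than running the Pod directly, because the RC still provides high availability and makes sure one Pod is always running.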
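
The replica-count rule described above can be sketched in a few lines of Python. This is only an illustration of the comparison, not the actual controller code; the function name `reconcile` and the choice of which surplus pods to delete are assumptions made for the example.

```python
from typing import Dict, List


def reconcile(desired: int, running: List[str]) -> Dict[str, object]:
    """Compare the desired replica count with the running pods.

    Returns how many pods to create and which running pods to delete.
    """
    if desired < 0:
        raise ValueError("desired replica count must not be negative")
    missing = desired - len(running)
    if missing > 0:
        # Too few replicas: new pods have to be started.
        return {"create": missing, "delete": []}
    # Enough or too many replicas: delete the surplus. Picking the pods at
    # the end of the list is an arbitrary choice made for this sketch.
    return {"create": 0, "delete": running[desired:]}


if __name__ == "__main__":
    print(reconcile(3, ["pod-a"]))
    print(reconcile(1, ["pod-a", "pod-b", "pod-c"]))
```

With a desired count of 1 and no running pod, the function asks for one new pod, which is the "always one Pod running" case mentioned above.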

 

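service:

A Pod is only one running instance of a service. It can stop on one node at any time and be replaced by a new Pod with a new IP on another node, so it cannot offer a service at a fixed IP and port. Providing a stable service requires service discovery and load balancing.
In a K8s cluster, the thing a client accesses is the Service object. Each Service has a virtual IP that is valid inside the cluster, and the service is reached inside the cluster through that virtual IP.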

 

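deployment:

A Deployment represents one update operation that a user performs on the K8s cluster. It can create a new service, update an existing service, or perform a rolling upgrade of a service. A rolling upgrade is really a compound operation: create a new RS, gradually raise the replica count of the new RS to the desired value, and lower the replica count of the old RS to 0. A single RS cannot describe such a compound operation well, so the more general Deployment is used. Given the direction K8s is heading, all long-running service workloads are expected to be managed through Deployments.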
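
The compound operation described above (new RS up, old RS down) can be traced with a small simulation. This is a simplified sketch: the function name `rolling_update` and the fixed `step` are assumptions for the example, whereas a real Deployment paces the rollout with its `maxSurge` and `maxUnavailable` settings.

```python
from typing import List, Tuple


def rolling_update(replicas: int, step: int = 1) -> List[Tuple[int, int]]:
    """Trace (old_rs_replicas, new_rs_replicas) during a simplified rollout."""
    if replicas < 0 or step < 1:
        raise ValueError("replicas must be >= 0 and step must be >= 1")
    old, new = replicas, 0
    states = [(old, new)]
    while new < replicas or old > 0:
        if new < replicas:
            # Scale the new ReplicaSet up first ...
            new = min(replicas, new + step)
            states.append((old, new))
        if old > 0:
            # ... then scale the old ReplicaSet down by the same amount.
            old = max(0, old - step)
            states.append((old, new))
    return states


if __name__ == "__main__":
    for old, new in rolling_update(2):
        print(f"old RS: {old}  new RS: {new}")
```

The trace always ends with the old RS at 0 and the new RS at the desired count, which is the end state the paragraph describes.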

 

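replica set:

RS is the next-generation RC. It provides the same high availability; the main difference is that RS, the newer of the two, supports more kinds of selector matching. A ReplicaSet is usually not used on its own but serves as the desired-state parameter of a Deployment.

daemon set:

Long-running and batch workloads focus on the business application: some nodes may run several Pods of the same kind while other nodes run none. Background support services instead focus on the nodes of the K8s cluster (physical or virtual machines) and must guarantee that one such Pod runs on every node. The nodes may be all cluster nodes or a specific subset chosen with nodeSelector. Typical background support services are storage, logging and monitoring, i.e. services that support the K8s cluster on every node.

job:

Job is the API object K8s uses to control batch tasks. The main difference between batch workloads and long-running workloads is that batch workloads run from start to finish, whereas long-running workloads run forever unless the user stops them. The Pods managed by a Job exit automatically once they have completed the task according to the user's settings. What counts as successful completion depends on the spec.completions policy: a single-Pod Job is complete as soon as one Pod succeeds; a fixed-completion-count Job makes sure that N tasks all succeed; a work-queue Job is marked successful based on the global success determined by the application.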

 

Environment:

master

node1

node2

Adding a network interface on Ubuntu 19.04:

ip link    # list the network interfaces

vim /etc/netplan/50-cloud-init.yaml    # add the new interface
root@master:~# cat /etc/netplan/50-cloud-init.yaml 
# This file is generated from information provided by
# the datasource.  Changes to it will not persist across an instance.
# To disable cloud-init's network configuration capabilities, write a file
# /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
# network: {config: disabled}
network:
    ethernets:
        ens33:
            dhcp4: true
        ens38:
            #dhcp4: true
            addresses: [192.168.134.130/24]
            nameservers:
                addresses: [114.114.114.114]
            gateway4: 192.168.134.2
    version: 2

netplan apply    # apply the new network configuration

 
------------------- CentOS environment setup
# Stop and disable the firewalld service
systemctl stop firewalld && systemctl disable firewalld

# Disable SELinux
sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config && setenforce 0

# Turn swap off and remove it from /etc/fstab
swapoff -a
yes | cp /etc/fstab /etc/fstab_bak
cat /etc/fstab_bak |grep -v swap > /etc/fstab

# Fix incorrect routing of bridged traffic
cat <<EOF >  /etc/sysctl.d/k8s.conf
vm.swappiness = 0
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF

# Apply the settings
modprobe br_netfilter
sysctl -p /etc/sysctl.d/k8s.conf

# Update the hosts file
cat >> /etc/hosts << EOF
192.168.0.20 master.example.com
192.168.0.49 node1.example.com
192.168.0.50 node2.example.com
EOF

# Install Docker
yum -y install docker
systemctl enable docker && systemctl start docker

# Configure the Aliyun Kubernetes repository
cat >> /etc/yum.repos.d/k8s.repo << EOF
[kubernetes]
name=kubernetes repo
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
gpgcheck=0
EOF

# Install kubelet/kubeadm/kubectl
yum -y install kubelet kubeadm kubectl
systemctl enable kubelet && systemctl start kubelet



------------------- end of CentOS environment setup

---------------- Ubuntu environment setup
swapoff -a
yes | cp /etc/fstab /etc/fstab_bak
cat /etc/fstab_bak |grep -v swap > /etc/fstab

# Fix incorrect routing of bridged traffic
cat <<EOF >  /etc/sysctl.d/k8s.conf
vm.swappiness = 0
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF

# Apply the settings
modprobe br_netfilter
sysctl -p /etc/sysctl.d/k8s.conf

# Update the hosts file
cat >> /etc/hosts << EOF
192.168.2.150 master.example.com
192.168.2.151 node1.example.com
192.168.2.152 node2.example.com
EOF

apt-get -y install docker.io
systemctl enable docker && systemctl start docker

apt-get update && apt-get install -y apt-transport-https curl

cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF

curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -

apt-get update
apt-get -y install kubelet=1.13.1-00 kubeadm=1.13.1-00 kubectl=1.13.1-00 kubernetes-cni=0.6.0-00
systemctl enable kubelet && systemctl start kubelet

List the available package versions on Ubuntu:
apt-cache madison kubelet
---------------- end of Ubuntu environment setup


------------------------------- python3 script that pulls the images automatically:
import subprocess

# Mirror registry to pull from, and the registry name that kubeadm expects.
S_registry = 'registry.cn-beijing.aliyuncs.com/kubernetesdevops/'
D_registry = 'k8s.gcr.io/'

master_image = ['kube-apiserver:v1.13.1','kube-controller-manager:v1.13.1',
                'kube-scheduler:v1.13.1','kube-proxy:v1.13.1','pause:3.1',
                'etcd:3.2.24','coredns:1.2.6','flannel:v0.10.0-amd64','kubernetes-dashboard-amd64:v1.10.0']

def PullImage(registry, images):
    """Pull every image in the list from the given registry."""
    for image in images:
        # check=False: keep going even if a single pull fails
        subprocess.run(["docker", "pull", registry + image], check=False)
        print("done!")

def TagImage(sregistry, dregistry, images):
    """Re-tag each pulled image under the destination registry name."""
    for image in images:
        subprocess.run(["docker", "tag", sregistry + image, dregistry + image], check=False)
        print("done!")

if __name__ == '__main__':
    PullImage(S_registry, master_image)
    #TagImage(S_registry, D_registry, master_image)

----------------------------- end of python3 image pull script

Remove the duplicate mirror-registry tags:
for i in `docker images |grep beijing|awk '{print $1":"$2}'`;do docker rmi $i;done

Pull the images listed above; node1 and node2 need them as well.
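
The pull, tag and cleanup steps above can be previewed before running anything. The sketch below only builds the `docker` command lines as lists and does not execute them; the function name `image_commands` is an assumption for the example, and the registry names are the ones used in these notes.

```python
from typing import List


def image_commands(src_registry: str, dst_registry: str,
                   images: List[str]) -> List[List[str]]:
    """Build the docker commands for each image without running them."""
    commands = []
    for image in images:
        src = src_registry + image
        dst = dst_registry + image
        commands.append(["docker", "pull", src])      # fetch from the mirror
        commands.append(["docker", "tag", src, dst])  # name kubeadm expects
        commands.append(["docker", "rmi", src])       # drop the mirror tag
    return commands


if __name__ == "__main__":
    src = "registry.cn-beijing.aliyuncs.com/kubernetesdevops/"
    for cmd in image_commands(src, "k8s.gcr.io/", ["pause:3.1"]):
        print(" ".join(cmd))
```

Each printed line can be checked by eye, or passed to `subprocess.run` once it looks right.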


------------- master initialization
kubeadm init --kubernetes-version=v1.13.1 --apiserver-advertise-address 192.168.134.130 --pod-network-cidr=10.244.0.0/16

If the SystemVerification preflight check fails, it can be skipped:
kubeadm init --kubernetes-version=v1.13.1 --apiserver-advertise-address 192.168.134.130 --pod-network-cidr=10.244.0.0/16 --ignore-preflight-errors=SystemVerification

Output:
Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of machines by running the following on each node
as root:

  kubeadm join 192.168.134.130:6443 --token dyfl39.tsd8zqpfaehj8l9b --discovery-token-ca-cert-hash sha256:7e1358ca2c2c2edce1e548e0690ed1327fb41eb8150bb543794e8b7f48c654cd

----- end of output
----- point kubectl at the admin kubeconfig
echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> /etc/profile
source /etc/profile

kubectl get nodes    # the node shows NotReady because no pod network is installed yet


Install flannel
curl https://raw.githubusercontent.com/coreos/flannel/62e44c867a2846fefb68bd5f178daf4da3095ccb/Documentation/kube-flannel.yml -O

Be sure to fix the image name first: tag the locally pulled image with the name the manifest references
docker tag k8s.gcr.io/flannel:v0.10.0-amd64 quay.io/coreos/flannel:v0.10.0-amd64


kubectl apply -f kube-flannel.yml

Check that the pods are running
root@master:~# kubectl get pods --all-namespaces
NAMESPACE     NAME                             READY   STATUS    RESTARTS   AGE
kube-system   coredns-86c58d9df4-9h9sw         1/1     Running   0          50m
kube-system   coredns-86c58d9df4-t9s9d         1/1     Running   0          50m
kube-system   etcd-master                      1/1     Running   0          49m
kube-system   kube-apiserver-master            1/1     Running   0          49m
kube-system   kube-controller-manager-master   1/1     Running   0          49m
kube-system   kube-flannel-ds-amd64-m89kt      1/1     Running   0          2m24s
kube-system   kube-proxy-ln4qs                 1/1     Running   0          50m
kube-system   kube-scheduler-master            1/1     Running   0          49m


root@master:~# kubectl get nodes
NAME     STATUS   ROLES    AGE   VERSION
master   Ready    master   52m   v1.13.1

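-------------- end of master initialization



Deploy node1 with kubeadm: run the kubeadm join command printed above on node1.
Note that the nodes also need the flannel image tag.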

---------------------- installing the dashboard on a kubeadm cluster
Install the dashboard on master.
Pull the image first:
docker pull registry.cn-beijing.aliyuncs.com/kubernetesdevops/kubernetes-dashboard-amd64:v1.10.0
or use this one:
docker pull registry.cn-shanghai.aliyuncs.com/coolyeah/kubernetes-dashboard-amd64:v1.10.1

Then tag it:
docker tag registry.cn-beijing.aliyuncs.com/kubernetesdevops/kubernetes-dashboard-amd64:v1.10.0 k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.0

Find the yaml file on GitHub:
https://github.com/kubernetes/dashboard/tree/v1.10.0
Download the yaml:
curl -O https://raw.githubusercontent.com/kubernetes/dashboard/v1.10.0/src/deploy/recommended/kubernetes-dashboard.yaml

Apply it:
kubectl apply -f kubernetes-dashboard.yaml
Check:
kubectl get pods --all-namespaces
The pod does not come up, so look at the details:
kubectl describe pods kubernetes-dashboard-79ff88449c-w55xl -n kube-system
This shows Error: ErrImagePull
Failed to pull image "k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.0": rpc error: code = Unknown desc = Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
The pod was scheduled onto node1, so node1 also needs the dashboard image.

Once node1 has the image, delete the dashboard:
kubectl delete -f kubernetes-dashboard.yaml

Apply it again:
kubectl apply -f kubernetes-dashboard.yaml

Check:
kubectl get pods --all-namespaces
kubectl describe pods  kubernetes-dashboard-79ff88449c-554h9 -n kube-system

Look at the port the dashboard exposes:
kubectl get service --namespace=kube-system

Try port 443 on master:
http://192.168.134.130:443    # not reachable, because the Service is only exposed inside the cluster
In kubernetes-dashboard.yaml, add nodePort: 31234 at line 161
and type: NodePort at line 164, so that the Service is exposed as a NodePort.
------- example:
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kube-system
spec:
  ports:
    - port: 443
      targetPort: 8443
      nodePort: 31234
  selector:
    k8s-app: kubernetes-dashboard
  type: NodePort
----- end of example
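
A quick sanity check for a NodePort Service like the one above can be written as follows. It is a sketch that works on the Service as a plain Python dict; the function name `check_node_ports` is an assumption for the example. The range 30000-32767 is the default NodePort range of the API server and can be different on a cluster that overrides it.

```python
from typing import List, Tuple

# Default NodePort range of kube-apiserver; a cluster may be configured
# with a different range.
DEFAULT_NODE_PORT_RANGE = (30000, 32767)


def check_node_ports(service: dict,
                     port_range: Tuple[int, int] = DEFAULT_NODE_PORT_RANGE
                     ) -> List[str]:
    """Return a list of problems found in the Service (empty if none)."""
    problems = []
    spec = service.get("spec", {})
    low, high = port_range
    for port in spec.get("ports", []):
        node_port = port.get("nodePort")
        if node_port is None:
            continue
        if spec.get("type") not in ("NodePort", "LoadBalancer"):
            problems.append(f"nodePort {node_port} set but type is not NodePort")
        if not low <= node_port <= high:
            problems.append(f"nodePort {node_port} outside {low}-{high}")
    return problems


if __name__ == "__main__":
    dashboard = {
        "spec": {
            "type": "NodePort",
            "ports": [{"port": 443, "targetPort": 8443, "nodePort": 31234}],
        }
    }
    print(check_node_ports(dashboard) or "ok")
```

The dashboard example passes because 31234 lies inside the default range and the type is NodePort.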

Delete the dashboard:
kubectl delete -f kubernetes-dashboard.yaml

Recreate it:
kubectl apply -f kubernetes-dashboard.yaml

Check the svc:
root@master:~# kubectl get svc --namespace=kube-system
NAME                   TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)         AGE
kube-dns               ClusterIP   10.96.0.10      <none>        53/UDP,53/TCP   26h
kubernetes-dashboard   NodePort    10.109.246.38   <none>        443:31234/TCP   3s

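Browse to port 31234 on the master machine:
https://192.168.2.150:31234

Logging in requires a token, so fetch one first:
kubectl get secret -n kube-system
This lists kubernetes-dashboard-token-p9kvp
kubectl describe secret kubernetes-dashboard-token-p9kvp -n kube-system
Paste the long token into the web UI.
After logging in, yellow warning messages appear. To get rid of them, create an admin account:

vim kube-user.yml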
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: admin
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
subjects:
  - kind: ServiceAccount
    name: admin
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io

kubectl apply -f kube-user.yml
This time grep for admin:
kubectl get secret -n kube-system | grep admin
Finally get the token:
kubectl describe secret admin-token-q42p6 -n kube-system
Log out of the web UI and log in again with this token.



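Tip:
Copy someone else's image into your own Alibaba Cloud registry.
Pull it first, e.g.:
docker pull registry.cn-beijing.aliyuncs.com/kubernetesdevops/kubernetes-dashboard-amd64:v1.10.0
Then tag it with your own repository name:
docker tag registry.cn-beijing.aliyuncs.com/kubernetesdevops/kubernetes-dashboard-amd64:v1.10.0 registry.cn-beijing.aliyuncs.com/kubernetes-alex/kubernetes-dashboard-amd64:v1.10.0
Log in to the registry:
docker login registry.cn-beijing.aliyuncs.com
Finally push it:
docker push registry.cn-beijing.aliyuncs.com/kubernetes-alex/kubernetes-dashboard-amd64:v1.10.0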

=============================

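Common kubectl command-line operations
# kubectl commands


## Command-line syntax
    kubectl [command] [TYPE] [NAME] [flags]
    
    command: create/delete/get/describe/apply
    TYPE: the resource type; case-insensitive, and singular, plural or abbreviated forms are accepted.
    NAME: the name of the resource object; case-sensitive.
    flags: optional arguments, e.g. -n selects the namespace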

## Resource types
    daemonsets   ds
    deployments 
    events    ev  events
    endpoints  ep  
    horizontalpodautoscalers   hpa    horizontal pod autoscaling
    ingresses   ing
    jobs
    nodes   no
    namespaces ns
    pods po
    persistentvolumes pv   persistent volumes
    persistentvolumeclaims   pvc   persistent volume claims
    resourcequotas  quota
    replicationcontrollers rc
    secrets 
    services   svc
    serviceaccounts sa
## Exercise
View several kinds of resource objects at once
```
kubectl get pod/etcd-master.example.com  svc/kubernetes-dashboard -n kube-system
```



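## kubectl subcommands
    annotate  add or update annotations on a resource object
    apply  kubectl apply -f filename    create or update resource objects from a configuration file
    attach  kubectl attach pod -c container    attach to a container in a running pod
    cluster-info kubectl cluster-info    show cluster information
    completion kubectl completion bash    print the shell completion script (here for bash)
    config kubectl config get-clusters    view or modify the kubeconfig file
    create kubectl create -f kube-user.yml    create resource objects from a configuration file
    delete kubectl delete -f kube-user.yml    delete the resource objects defined in a configuration file
    describe  kubectl describe sa    show detailed information about resource objects
    edit kubectl edit sa    edit the properties of a resource object
    exec kubectl exec coredns-86c58d9df4-d8x49 -n kube-system -- ls    run a command inside a container
    label kubectl label node node1.example.com a=b    add a label to a resource object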


------------------------------------------
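Creating a pod object from yaml
Tip:
Look at another pod and display it as yaml:
kubectl edit pod coredns-86c58d9df4-9h9sw -n kube-system
(kubectl get pod coredns-86c58d9df4-9h9sw -n kube-system -o yaml prints the same yaml without opening an editor)
To learn how to write yaml, see the Kubernetes handbook and examples on GitHub:
https://github.com/feiskyer/kubernetes-handbook
https://github.com/kubernetes/examples
https://github.com/kubernetes/examples/tree/master/guestbook

Tip:
Re-tag all the downloaded k8s.gcr.io images:
 for i in `docker images|grep gcr|awk '{print $1":"$2}'|cut -d"/" -f2`;do docker tag k8s.gcr.io/$i registry.cn-shanghai.aliyuncs.com/alexhjl/$i;done
Push them to the Alibaba Cloud image registry:
 for i in `docker images |grep shanghai|awk '{print $1":"$2}'`;do docker push $i;done

Create nginx.yaml: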
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  containers:
  - name: nginx
    image: registry.xxx.com/nginx:latest
    imagePullPolicy: IfNotPresent
    ports:
    - containerPort: 80

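IfNotPresent: the image is pulled only when it is not already present locally.

kubectl create -f nginx.yaml
If no namespace is specified, the pod is created in the current namespace.
kubectl get pod
See which node the pod runs on:
kubectl describe pod nginx
Delete the pod:
kubectl delete pod nginx    # or: kubectl delete -f nginx.yaml


Why does the master node not run ordinary pods?
Because kubeadm marks the master node with a taint (node-role.kubernetes.io/master:NoSchedule). kubectl describe node master shows the taints, and kubectl get node --show-labels shows all labels.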


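Pod lifecycle phases:
Pending: the pod has been accepted by the cluster, but not all of its containers have been created yet; this includes time spent waiting to be scheduled and downloading images.
Running: the pod has been bound to a node and all containers have been created; at least one is running or starting.
Succeeded: all containers have terminated successfully.
Failed: all containers have terminated, and at least one of them terminated in failure.
Unknown: the state of the pod cannot be obtained.
kubectl submits the pod defined in the yaml to the apiserver, which stores the pod information in etcd and returns a confirmation to the client. The scheduler watches the apiserver for state changes and checks whether the pod is already bound to a node. If it is not, the scheduler picks a node for the pod and schedules it there. If the definition already binds the pod to an existing node, the pod goes to that node. After successful scheduling, the result is written back to etcd.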
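
The phase definitions above can be read as a small decision function. This is a deliberately simplified sketch: the function name `pod_phase` and the four container state names are assumptions made for the example, and the real kubelet logic also takes the restart policy into account.

```python
from typing import List

# Hypothetical container states used only in this sketch.
VALID_STATES = {"not_started", "running", "succeeded", "failed"}


def pod_phase(scheduled: bool, containers: List[str]) -> str:
    """Derive a pod phase from simplified container states."""
    unknown = set(containers) - VALID_STATES
    if unknown:
        raise ValueError(f"unknown container states: {sorted(unknown)}")
    if not scheduled or not containers or "not_started" in containers:
        # Not bound to a node yet, or some container was never created.
        return "Pending"
    if "running" in containers:
        return "Running"
    # Every container has terminated at this point.
    if "failed" in containers:
        return "Failed"
    return "Succeeded"


if __name__ == "__main__":
    print(pod_phase(False, ["not_started"]))
    print(pod_phase(True, ["running", "succeeded"]))
    print(pod_phase(True, ["succeeded", "failed"]))
```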


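Kubernetes design philosophy
pdf, page 49

Kubernetes core concepts and API objects
pdf, page 51

K8s core components and communication ports
pdf, page 58

How kube-apiserver works:
kubectl api-versions    # list the API versions the server supports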
root@master:~# kubectl api-versions
admissionregistration.k8s.io/v1beta1
apiextensions.k8s.io/v1beta1
apiregistration.k8s.io/v1
apiregistration.k8s.io/v1beta1
apps/v1
apps/v1beta1
apps/v1beta2
authentication.k8s.io/v1
authentication.k8s.io/v1beta1
authorization.k8s.io/v1
authorization.k8s.io/v1beta1
autoscaling/v1
autoscaling/v2beta1
autoscaling/v2beta2
batch/v1
batch/v1beta1
certificates.k8s.io/v1beta1
coordination.k8s.io/v1beta1
events.k8s.io/v1beta1
extensions/v1beta1
networking.k8s.io/v1
policy/v1beta1
rbac.authorization.k8s.io/v1
rbac.authorization.k8s.io/v1beta1
scheduling.k8s.io/v1beta1
storage.k8s.io/v1
storage.k8s.io/v1beta1
v1

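kubectl api-resources --api-group=apps     # list the resource types in the apps API group

kubectl get --raw /api/v1/namespaces     # send a raw GET request to the API server

kubectl proxy --port=8080 &      # start a local proxy to the API server; --port=8080 is optional (the default port is 8001)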


----------------------------
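kubernetes scheduler
Handles the second half of a pod's creation: placing the pod on a node.
It queries the apiserver for pods that have no node assigned and schedules them according to the scheduling policy.
The pods to schedule usually come from an RC/Deployment/DaemonSet/Job.

How the scheduler works:
1. A new pod is created through the API.
2. The Controller Manager creates pods to make up the replica count.
3. The Scheduler binds the pod to a matching node in the cluster using its scheduling algorithm.
4. Once the binding succeeds, the binding information is written to etcd.

Node scheduling options:
nodeSelector (directed scheduling):
    schedule onto nodes whose labels match
nodeAffinity (affinity scheduling):
    schedule onto nodes whose labels match (supports preferences and richer matching)
    requiredDuringSchedulingRequiredDuringExecution (similar to nodeSelector; described in the Kubernetes documentation as a planned type)
    requiredDuringSchedulingIgnoredDuringExecution (hard requirement)
    preferredDuringSchedulingIgnoredDuringExecution (soft preference)
podAffinity:
    schedule onto the nodes where matching pods are running.

nodeAffinity
requiredDuringSchedulingRequiredDuringExecution: when a node stops meeting the conditions, the system removes the pods previously scheduled onto it.
requiredDuringSchedulingIgnoredDuringExecution: like the above, except that when a node stops meeting the conditions, the system does not necessarily remove the pods previously scheduled onto it.
preferredDuringSchedulingIgnoredDuringExecution: among the nodes that meet the scheduling conditions, specifies which ones should be preferred. Likewise, when a node stops meeting the conditions, previously scheduled pods are not necessarily removed.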

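Label a node
kubectl label nodes node01.example.com cpucounts=four
Show the node labels
kubectl get nodes --show-labels

Add the scheduling constraint to the resource object
spec:
  nodeSelector:
    cpucounts: four

Experiment steps:
1. Create a pod with the scheduling constraint.
2. Check the pod's status.
3. Add the matching label to a node.
4. Verify that the pod runs normally.

If no node matches, the pod stays in the Pending state.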
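
The nodeSelector rule is an exact match of every key and value against the node's labels. A minimal sketch, with a hypothetical function name `schedulable_nodes` and made-up node data:

```python
from typing import Dict, List


def schedulable_nodes(node_selector: Dict[str, str],
                      nodes: Dict[str, Dict[str, str]]) -> List[str]:
    """Return the names of nodes whose labels contain every selector pair."""
    return sorted(
        name
        for name, labels in nodes.items()
        if all(labels.get(key) == value for key, value in node_selector.items())
    )


if __name__ == "__main__":
    # Hypothetical cluster: only node01 carries the cpucounts=four label.
    nodes = {
        "node01.example.com": {"cpucounts": "four"},
        "node02.example.com": {},
    }
    print(schedulable_nodes({"cpucounts": "four"}, nodes))
    # An empty result corresponds to the pod staying Pending.
    print(schedulable_nodes({"cpucounts": "eight"}, nodes))
```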

cat nodeselector_nginx.yaml    # node01 was labeled cpucounts=four above, so this pod matches it
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  nodeSelector:
    cpucounts: four
  containers:
  - name: nginx
    image: docker.io/library/nginx     
    imagePullPolicy: IfNotPresent
    ports:
    - containerPort: 80

kubectl apply -f nodeselector_nginx.yaml
If the pod does not start, describe it to see the details
kubectl describe pod nginx

cat nodeAffinity-R.yaml    # schedules this pod onto a node labeled area=test1 or area=test2
apiVersion: v1
kind: Pod
metadata:
  name: testschduler
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: area
            operator: In
            values:
            - test1
            - test2
  containers:
  - name: myapp
    image: nginx
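requiredDuringSchedulingIgnoredDuringExecution:
    1. Its value is a list of objects, made up of one or more nodeSelectorTerm objects.
    2. When there are several nodeSelectorTerms, satisfying any one of them is enough.

nodeSelectorTerm:
    1. Defines one node selector entry, made up of the matching rules in one or more matchExpressions objects.
    2. The rules are combined with logical AND: all matchExpressions under one nodeSelectorTerm must be satisfied.

matchExpressions:
    made up of one or more label selector requirements
operator:
    the operator used in a label selector expression
    common values: In NotIn Exists DoesNotExist Lt Gt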





cat nodeAffinity-P.yaml 
apiVersion: v1
kind: Pod
metadata:
  name: testschduler
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 90
        preference:
          matchExpressions:
          - key: area
            operator: In
            values:
            - test1
            - test2
      - weight: 80
        preference:
          matchExpressions:
          - key: vm
            operator: Exists
            values: []

  containers:
  - name: myapp
    image: nginx
This pod prefers nodes that carry the label
area=test1/test2 or that have a vm label.

weight=170  area=test1 vm=true
weight=90 area=test1 
weight=80 vm=true
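
The three totals listed above are simply the sum of the weights of every preference a node satisfies. A small sketch reproduces them; the function names are assumptions for the example, only the `In` and `Exists` operators used in this manifest are handled, and the real scheduler combines this score with its other priorities.

```python
from typing import Dict, List

# The two preferences from nodeAffinity-P.yaml, as plain Python data.
PREFERENCES = [
    {"weight": 90, "preference": {"matchExpressions": [
        {"key": "area", "operator": "In", "values": ["test1", "test2"]}]}},
    {"weight": 80, "preference": {"matchExpressions": [
        {"key": "vm", "operator": "Exists", "values": []}]}},
]


def matches(labels: Dict[str, str], expr: dict) -> bool:
    """Evaluate an In or Exists expression against node labels."""
    if expr["operator"] == "In":
        return labels.get(expr["key"]) in expr["values"]
    if expr["operator"] == "Exists":
        return expr["key"] in labels
    raise ValueError(f"operator not handled in this sketch: {expr['operator']}")


def affinity_score(labels: Dict[str, str], preferences: List[dict]) -> int:
    """Sum the weights of all preferences whose expressions all match."""
    total = 0
    for pref in preferences:
        exprs = pref["preference"]["matchExpressions"]
        if all(matches(labels, e) for e in exprs):
            total += pref["weight"]
    return total


if __name__ == "__main__":
    for labels in ({"area": "test1", "vm": "true"}, {"area": "test1"}, {"vm": "true"}):
        print(labels, affinity_score(labels, PREFERENCES))
```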








cat podAffinity-R.yaml 
apiVersion: v1
kind: Pod
metadata:
  name: testschduler
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
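      - labelSelector:
          matchExpressions:
          - key: area
            operator: In
            values:
            - test1
            - test2
        topologyKey: kubernetes.io/hostname
  containers:
  - name: myapp
    image: nginx
Pods are selected through labelSelector, and the new pod is placed according to where those pod objects are located: with topologyKey kubernetes.io/hostname, on the same node.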


--------------------------------------------
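Controller manager
Made up of kube-controller-manager and cloud-controller-manager.
The controller manager is the brain of Kubernetes.
It watches the state of the whole cluster through the apiserver and makes sure it matches the desired state.



Metrics (performance data):
The controller manager metrics expose performance measurements of the controllers' internal logic.
By default they are served on port 10252 (for Prometheus).

Go runtime metrics
etcd request latency
cloud provider API request latency

Access: http://localhost:10252/metrics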

------------------------------------------------
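kubelet
Every node runs a kubelet process, which listens on port 10250 by default.
It receives and carries out the instructions sent by the master, and manages pods and the containers in them.
Each kubelet registers its own node information with the apiserver.
It periodically reports the node's status to the master.
It monitors node and container resources through cAdvisor.



Container health checks:
LivenessProbe
Determines whether a container is healthy. The kubelet calls the container's LivenessProbe periodically to diagnose its health. If the probe reports the container as unhealthy, the kubelet kills the container and handles it according to the pod's restart policy.
If a container defines no liveness probe, the kubelet treats the result as success.

ReadinessProbe
Determines whether a container has finished starting and is ready to accept requests. If the probe fails, the pod's status is changed to not ready.

Implementations:
ExecAction: runs a command inside the container; an exit status of 0 means the container is healthy.
TCPSocketAction: performs a TCP check against the container's IP address and port; if the port can be reached, the container is healthy.
HTTPGetAction: issues an HTTP GET against the container's IP address, port and path; a status code of at least 200 and below 400 means the container is healthy.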
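
The success criteria of the three probe types can be illustrated with standard-library Python. This is only a sketch of the checks themselves, run from wherever the script runs rather than by the kubelet; the function names are assumptions for the example.

```python
import socket
import subprocess
from typing import List


def exec_probe(command: List[str]) -> bool:
    """ExecAction: healthy when the command exits with status 0."""
    return subprocess.run(command, check=False).returncode == 0


def tcp_probe(host: str, port: int, timeout: float = 1.0) -> bool:
    """TCPSocketAction: healthy when a TCP connection can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_status_healthy(status_code: int) -> bool:
    """HTTPGetAction: healthy when 200 <= status code < 400."""
    return 200 <= status_code < 400


if __name__ == "__main__":
    print(http_status_healthy(200), http_status_healthy(302), http_status_healthy(404))
```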



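cAdvisor resource monitoring:
An open-source agent that analyzes the resource usage and performance characteristics of containers.
It automatically discovers all containers on its node and collects CPU, memory, file system and network usage statistics.
cAdvisor exposes a UI on port 4194 of the node it runs on.


Container runtime (CR):
The container runtime is one of the most important Kubernetes components.
It is what actually manages the life cycle of images and containers.
The kubelet talks to the container runtime through CRI to manage images and containers.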


Getting a node's performance metrics
Check whether port 10255 is open on each node. If it is not,
edit
 vim /var/lib/kubelet/kubeadm-flags.env 
and add
--read-only-port=10255
Full line:
KUBELET_KUBEADM_ARGS=--read-only-port=10255 --cgroup-driver=cgroupfs --network-plugin=cni --pod-infra-container-image=k8s.gcr.io/pause:3.1 --resolv-conf=/run/systemd/resolve/resolv.conf
systemctl restart kubelet
Or:
vim /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
Add one line
Environment="KUBELET_API=--read-only-port=10255"
Change one line
ExecStart=/usr/bin/kubelet $KUBELET_API $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
Full file: ------------- start of file
# Note: This dropin only works with kubeadm and kubelet v1.11+
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
Environment="KUBELET_API=--read-only-port=10255"
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/default/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_API $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
------------------------ end of file
systemctl daemon-reload
systemctl restart kubelet
After that, the node's performance statistics can be fetched with:
curl 192.168.2.151:10255/stats/summary



--------------------------------------------------------------------------------
Docker video tutorial
-------------------------------
Prometheus video
https://github.com/aaron111com/Jenkinsdocs/blob/master/chapter/Prometheus%E5%AE%89%E8%A3%85%E9%83%A8%E7%BD%B2+%E7%9B%91%E6%8E%A7+%E7%BB%98%E5%9B%BE+%E5%91%8A%E8%AD%A6.md

1. Install node_exporter under /usr/local
tar zxvf node_exporter-0.18.1.linux-amd64.tar.gz -C /usr/local/

vim /etc/systemd/system/node_exporter.service
[Unit]
Description=node_exporter
After=network.target

[Service]
Restart=on-failure
ExecStart=/usr/local/node_exporter-0.18.1.linux-amd64/node_exporter

[Install]
WantedBy=multi-user.target

systemctl start node_exporter
systemctl status node_exporter
systemctl enable node_exporter
netstat -ntulp| grep 9100    # node_exporter listens on port 9100
The metrics can be viewed in a browser at http://192.168.2.150:9100/metrics
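
The /metrics page is plain text with one sample per line, so it is easy to read from a script. The sketch below parses such text into a dict; the function name `parse_metrics` is an assumption for the example, the sample values are made up, and lines with a trailing timestamp are not handled.

```python
from typing import Dict

# Made-up sample in the text format served on /metrics.
SAMPLE = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.21
node_cpu_seconds_total{cpu="0",mode="idle"} 1234.5
"""


def parse_metrics(text: str) -> Dict[str, float]:
    """Map each sample (name plus labels) to its value.

    Comment lines (# HELP, # TYPE) and blank lines are skipped.
    Assumes no timestamp follows the value.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated field on the line.
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


if __name__ == "__main__":
    for name, value in parse_metrics(SAMPLE).items():
        print(name, value)
```

To use it against a live exporter, the text could be fetched with `urllib.request.urlopen("http://192.168.2.150:9100/metrics")` and decoded before parsing.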


2. Install Prometheus
tar zxvf prometheus-2.12.0.linux-amd64.tar.gz -C /usr/local/
vim /etc/systemd/system/prometheus.service
[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network-online.target

[Service]
Restart=on-failure
ExecStart=/usr/local/prometheus-2.12.0.linux-amd64/prometheus --config.file=/usr/local/prometheus-2.12.0.linux-amd64/prometheus.yml --storage.tsdb.path=/var/lib/prometheus --web.external-url=http://0.0.0.0:9090

[Install]
WantedBy=multi-user.target

systemctl start prometheus
netstat -ntulp| grep 9090
systemctl enable prometheus

Add a host with static configuration:
vim /usr/local/prometheus-2.12.0.linux-amd64/prometheus.yml
# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['localhost:9090']
  # custom job: statically add one host and scrape its node_exporter
  - job_name: "my target"                   
    static_configs:
    - targets: ['localhost:9100'] 

After adding it, restart Prometheus
systemctl restart prometheus

Open the web page in a browser
http://192.168.2.150:9090
and go to Status -> Targets


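Service discovery:
Dynamic discovery suits cloud environments, where hosts scale up and down and configuration has to follow quickly.
Taking Kubernetes as an example:
The API address and authentication credentials must be configured.
Prometheus keeps watching the cluster for changes,
picks up the machines added to or removed from the cluster, and updates its list of scrape targets.


Prometheus data storage:
Local storage: the built-in time series database saves data to the local disk.
Remote storage: suited to large volumes of monitoring data; backends such as OpenTSDB, InfluxDB and Elasticsearch are supported. An adapter implements the remote read and write interfaces for the backend, after which monitoring works as usual.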


Install Grafana
https://mirrors.tuna.tsinghua.edu.cn/help/grafana/
For Debian / Ubuntu users
First trust the GPG public key of https://packages.grafana.com/:
curl https://packages.grafana.com/gpg.key | sudo apt-key add -

Make sure apt supports HTTPS:
sudo apt-get install -y apt-transport-https
Choose the Grafana version you want to install (independent of your Debian/Ubuntu release) and write the following line to /etc/apt/sources.list.d/grafana.list

deb https://mirrors.tuna.tsinghua.edu.cn/grafana/apt/ stable main
Install Grafana
sudo apt-get update
sudo apt-get install grafana
Start it:
systemctl start grafana-server
Browse to port 3000
192.168.2.150:3000

Add a data source
Choose Prometheus, paste its address, and import the 3 dashboards offered there

This dashboard monitors host CPU, memory, disk and network:
https://grafana.com/dashboards/9276
Import it into Grafana:
choose Dashboards,
then Import, and import the JSON.



Alerting:
alertmanager
 tar zxvf alertmanager-0.19.0.linux-amd64.tar.gz -C /usr/local/
 cat /etc/systemd/system/alertmanager.service
[Unit]
Description=Alertmanager
After=network-online.target

[Service]
Restart=on-failure
ExecStart=/usr/local/alertmanager-0.19.0.linux-amd64/alertmanager --config.file=/usr/local/alertmanager-0.19.0.linux-amd64/alertmanager.yml

[Install]
WantedBy=multi-user.target

systemctl start alertmanager
systemctl enable alertmanager

netstat -ntulp| grep 9093

Open http://192.168.2.150:9093 in a browser
Edit the Prometheus configuration file:
root@master:/usr/local/prometheus-2.12.0.linux-amd64# cat prometheus.yml 
# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      - localhost:9093         # changed to the Alertmanager address

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "rules/host_rules.yml"     # new: create the rules directory and this yml file
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['localhost:9090']
  # add custom monitor job,monitor node_exporter 
  - job_name: "my target"                   
    static_configs:
    - targets: ['localhost:9100']


mkdir rules
vim rules/host_rules.yml
groups:
- name: 'Linux Instances'
  rules:
  - alert: InstanceDown
    expr: up == 0
    for: 5s
    labels:
      severity: page
   # Prometheus templates apply here in the annotation and label fields of the alert.
    annotations:
      description: 'has been down for more than 5 s.'
Restart Prometheus
systemctl restart prometheus

Now stop node_exporter
systemctl stop node_exporter
and the Prometheus web UI will show the alert
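
How the `for: 5s` clause interacts with the 15 s evaluation interval can be traced with a short simulation. This is a simplified model of the inactive, pending and firing states for the `up == 0` rule; the function name `alert_states` is an assumption for the example, and it is not Prometheus code.

```python
from typing import List, Optional


def alert_states(up_samples: List[int], interval_s: int, for_s: int) -> List[str]:
    """Return the alert state after each rule evaluation.

    up_samples holds the value of `up` seen at each evaluation.
    The alert is pending while `up == 0` has held for less than for_s
    seconds, and firing once it has held for at least for_s seconds.
    """
    states = []
    held_for: Optional[int] = None  # seconds the condition has held so far
    for up in up_samples:
        if up != 0:
            held_for = None
            states.append("inactive")
            continue
        held_for = 0 if held_for is None else held_for + interval_s
        states.append("firing" if held_for >= for_s else "pending")
    return states


if __name__ == "__main__":
    # node_exporter is up, stopped for two evaluations, then started again.
    print(alert_states([1, 0, 0, 1], interval_s=15, for_s=5))
```

Under this model the alert first shows as pending and only fires at the next evaluation, so with a 15 s interval it takes longer than the 5 s in the rule to fire.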

To configure email alerts,
Alertmanager needs a configuration like this in alertmanager.yml:
global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.qq.com:465'
  smtp_from: 'xxxxx@qq.com'
  smtp_auth_username: 'xxxx@qq.com'
  smtp_auth_password: 'xxxkbpfmygbecg'
  smtp_require_tls: false

route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'toemail'
receivers:
- name: 'toemail'
  email_configs:
  - to: 'xxxxx@qq.com'
    send_resolved: true
- name: 'web.hook'
  webhook_configs:
  - url: 'http://127.0.0.1:5001/'
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']

systemctl restart alertmanager
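
The inhibit_rules entry in the configuration above mutes a warning alert while a critical alert with the same alertname, dev and instance labels is firing. The sketch below models that rule on plain label dicts; the function name `is_inhibited` and the alert data are assumptions for the example, and it is not Alertmanager code.

```python
from typing import Dict, List

# The rule from the configuration above, as plain Python data.
RULE = {
    "source_match": {"severity": "critical"},
    "target_match": {"severity": "warning"},
    "equal": ["alertname", "dev", "instance"],
}


def label_match(alert: Dict[str, str], wanted: Dict[str, str]) -> bool:
    """True when the alert carries every wanted label with the same value."""
    return all(alert.get(key) == value for key, value in wanted.items())


def is_inhibited(target: Dict[str, str], firing: List[Dict[str, str]],
                 rule: dict) -> bool:
    """True when some firing source alert mutes the target alert."""
    if not label_match(target, rule["target_match"]):
        return False
    for source in firing:
        if source is target or not label_match(source, rule["source_match"]):
            continue
        # Labels missing on both sides count as equal.
        if all(source.get(name) == target.get(name) for name in rule["equal"]):
            return True
    return False


if __name__ == "__main__":
    critical = {"alertname": "InstanceDown", "instance": "a", "severity": "critical"}
    warning = {"alertname": "InstanceDown", "instance": "a", "severity": "warning"}
    print(is_inhibited(warning, [critical, warning], RULE))
```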