Kubernetes is an open-source platform for automating deployment, scaling, and operations of application containers across clusters of hosts, providing container-centric infrastructure.
Features:
Other characteristics:
Summary: Kubernetes schedules, manages, and scales applications (Deployment/DaemonSet/StatefulSet/Job, health checks, auto-scaling, rolling updates), provides a runtime platform for applications (logging, monitoring, service discovery, load balancing, authentication/authorization), and manages, controls, and allocates platform resources (memory, CPU, network, storage, images).
Let's look at the definition of an operating system:
An operating system (OS) is the collection of programs that controls and manages the hardware and software resources of an entire computer system, organizes and schedules the computer's work and resource allocation, and provides a convenient interface and environment to users and other software. Kubernetes is a distributed operating system: it manages the software and hardware resources of a cluster of machines, organizes and schedules programs (containers) and allocates resources, and provides a convenient interface and environment to users and other software.
Most concepts from a single-machine operating system have, or are getting, a counterpart in Kubernetes. For example, systemctl has a reload operation; Kubernetes does not have this yet, but it is being worked on.
This section is interesting and worth reading: what Kubernetes is not. Many of these items are things a Kubernetes distribution vendor needs to consider and deliver. Kubernetes does not:
- Provide middleware (e.g., message buses), data-processing frameworks (for example, Spark), databases (e.g., MySQL), nor cluster storage systems (e.g., Ceph) as built-in services. Such applications run *on* Kubernetes.
- Provide a click-to-deploy service marketplace.
- Dictate Continuous Integration (CI) workflows. This is an area where different users and projects have their own requirements and preferences, so Kubernetes supports layering CI workflows on top of it but doesn't dictate how that layering should work.
- Provide logging, monitoring, or alerting systems. (It provides some integrations as proof of concept.)
- Provide a comprehensive application configuration language/system (for example, jsonnet).
- Provide comprehensive machine configuration, maintenance, management, or self-healing systems.
Role | Component | Description
---|---|---
Master Components | kube-apiserver | Exposes the Kubernetes API; it is the front end for the Kubernetes control plane.
Master Components | etcd | Kubernetes' backing store; all cluster data is stored here.
Master Components | kube-controller-manager | A single binary that includes: 1. Node Controller: notices and responds when nodes go down. 2. Replication Controller: maintains the correct number of pods for every ReplicationController object. 3. Endpoints Controller: populates the Endpoints object (i.e., joins Services & Pods). 4. Service Account & Token Controllers: create default accounts and API access tokens for namespaces. 5. Others.
Master Components | cloud-controller-manager | A binary that runs controllers which interact with cloud providers, including: 1. Node Controller: checks the cloud provider to determine whether a node has been deleted in the cloud after it stops responding. 2. Route Controller: sets up routes in the underlying cloud infrastructure. 3. Service Controller: creates, updates, and deletes cloud provider load balancers. 4. Volume Controller: creates, attaches, and mounts volumes, interacting with the cloud provider to orchestrate them.
Master Components | kube-scheduler | Watches newly created pods that have no node assigned, and selects a node for them to run on.
Master Components | addons | Pods and services that implement cluster features, e.g., DNS (Cluster DNS is a DNS server, in addition to the other DNS server(s) in your environment, which serves DNS records for Kubernetes services), user interface, container resource monitoring, cluster-level logging.
Node components | kubelet | The primary node agent. Main functions: 1. Watches for pods that have been assigned to its node (either by the apiserver or via a local configuration file). 2. Mounts the pod's required volumes. 3. Downloads the pod's secrets. 4. Runs the pod's containers via Docker (or, experimentally, rkt). 5. Periodically executes any requested container liveness probes. 6. Reports the status of the pod back to the rest of the system, creating a "mirror pod" if necessary. 7. Reports the status of the node back to the rest of the system.
Node components | kube-proxy | Enables the Kubernetes service abstraction by maintaining network rules on the host and performing connection forwarding.
Node components | docker/rkt | The container runtime, for actually running containers.
Node components | supervisord | A lightweight process-babysitting system that keeps kubelet and docker running.
Node components | fluentd | A daemon that helps provide cluster-level logging.
Classification

Category | Names
---|---
Resource objects | Pod, ReplicaSet, ReplicationController, Deployment, StatefulSet, DaemonSet, Job, CronJob, HorizontalPodAutoscaling
Configuration objects | Node, Namespace, Service, Secret, ConfigMap, Ingress, Label, ThirdPartyResource, ServiceAccount
Storage objects | Volume, PersistentVolume
Policy objects | SecurityContext, ResourceQuota, LimitRange
Kubernetes Objects are persistent entities in the Kubernetes system. Kubernetes uses these entities to represent the state of your cluster. Specifically, they can describe what containerized applications are running (and on which nodes), the resources available to those applications, and the policies around how those applications behave.
Kubernetes Objects describe desired state => state-driven.
Kubernetes objects are applications, resources, and policies.
Every object has two nested fields: the Object Spec and the Object Status. The Spec describes the desired state, and the Status describes the current state; the system drives the Status to match the Spec. The job of the Kubernetes Control Plane is to make an object's actual state converge to its desired state.
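A minimal sketch of the spec/status split (names are hypothetical; `apps/v1beta2` matches the API version used elsewhere in these notes). Only `spec` is written by the user; `status` is maintained by the control plane:

```yaml
apiVersion: apps/v1beta2
kind: Deployment
metadata:
  name: web                # hypothetical name
spec:                      # desired state, written by the user
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
# status:                  # current state, written only by the control plane
#   availableReplicas: 3
```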
Omitted.
Labels are key/value pairs that are attached to objects, such as pods. Labels are intended to be used to specify identifying attributes of objects that are meaningful and relevant to users, but which do not directly imply semantics to the core system
Labels are not unique: many objects can carry the same label(s).
Via a label selector, the client/user can identify a set of objects. The label selector is the core grouping primitive in Kubernetes.
The API currently supports two types of selectors: equality-based (e.g., environment = production) and set-based (e.g., environment in (production, qa)).
See kubernetes.io/docs/concep… for examples.
Labels can be used for LIST and WATCH filtering, and to set references in API objects.
Some Kubernetes objects, such as services and replicationcontrollers
, also use label selectors to specify sets of other resources, such as pods. However, they only support equality-based requirement selectors:
"selector": {
    "component": "redis"
}
Newer resources, such as Job, Deployment, Replica Set, and Daemon Set
, support set-based requirements as well:
selector:
matchLabels:
component: redis
matchExpressions:
- {key: tier, operator: In, values: [cache]}
  - {key: environment, operator: NotIn, values: [dev]}
Another use case is using labels to select nodes.
Annotations attach metadata to objects. They differ from labels:
You can use either labels or annotations to attach metadata to Kubernetes objects. Labels can be used to select objects and to find collections of objects that satisfy certain conditions. In contrast, annotations are not used to identify and select objects. The metadata in an annotation can be small or large, structured or unstructured, and can include characters not permitted by labels.
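A short sketch contrasting the two kinds of metadata on one object (all values are made up for illustration):

```yaml
metadata:
  name: web
  labels:                    # identifying; usable in selectors; constrained syntax
    app: nginx
    environment: production
  annotations:               # non-identifying; not selectable; free-form
    example.com/build-info: "built by CI, commit abc123"   # hypothetical key/value
```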
Complete API details are documented using Swagger v1.2
and OpenAPI
(i.e., Swagger 2.0).
For example, /api/v1. By stability, API versions are divided into stable (v1), alpha (v1alpha1), and beta (v2beta3).
API groups exist to make it easier to extend the Kubernetes API.
Currently there are several API groups in use:
core
(oftentimes called "legacy", due to not having an explicit group name) group, which is at REST path /api/v1 and is not specified as part of the apiVersion field, e.g. apiVersion: v1. There are currently two ways to extend the API: CustomResourceDefinition and kube-aggregator.
An API group can be enabled or disabled when the apiserver starts, e.g.:

--runtime-config=extensions/v1beta1/deployments=false,extensions/v1beta1/ingress=false
This part comes from github.com/kubernetes/…
All JSON objects returned by an API MUST have the following fields:
Object content | Description
---|---
Metadata | MUST: namespace, name, uid; SHOULD: resourceVersion, generation, creationTimestamp, deletionTimestamp, labels, annotations
Spec and Status | status (current) converges toward spec (desired); a /status subresource MUST be provided to enable system components to update statuses of resources they manage; Status is often expressed as Conditions
References to related objects | ObjectReference type |
PATCH is special: it supports three patch types (JSON Patch, Merge Patch, and Strategic Merge Patch).
All compatible Kubernetes APIs MUST support "name idempotency" and respond with an HTTP status code 409 "Conflict" on a name collision.
Optional fields have the following properties:
Use the +optional comment tag rather than omitempty.
resourceVersion is used for concurrency control:
All Kubernetes resources have a "resourceVersion" field as part of their metadata.
Kubernetes leverages the concept of resource versions to achieve optimistic concurrency.
The resourceVersion is changed by the server every time an object is modified.
When does the API return the Status kind?
Kubernetes will always return the Status kind from any API endpoint when an error occurs. Clients SHOULD handle these types of objects when appropriate.
$ curl -v -k -H "Authorization: Bearer WhCDvq4VPpYhrcfmF6ei7V9qlbqTubUc" https://10.240.122.184:443/api/v1/namespaces/default/pods/grafana
> GET /api/v1/namespaces/default/pods/grafana HTTP/1.1
> User-Agent: curl/7.26.0
> Host: 10.240.122.184
> Accept: */*
> Authorization: Bearer WhCDvq4VPpYhrcfmF6ei7V9qlbqTubUc
>
< HTTP/1.1 404 Not Found
< Content-Type: application/json
< Date: Wed, 20 May 2015 18:10:42 GMT
< Content-Length: 232
<
{
"kind": "Status",
"apiVersion": "v1",
"metadata": {},
"status": "Failure",
"message": "pods \"grafana\" not found",
"reason": "NotFound",
"details": {
"name": "grafana",
"kind": "pods"
},
"code": 404
}
The API therefore exposes certain operations over upgradeable HTTP connections (described in RFC 2817) via the WebSocket and SPDY protocols.
Two such protocols are supported: WebSocket and SPDY.
Node Status | Description
---|---
Addresses | HostName / ExternalIP / InternalIP
Condition | OutOfDisk / Ready / MemoryPressure / DiskPressure / NetworkUnavailable
Capacity | The resources available on the node (CPU, memory, maximum number of pods)
Info | General node information (kernel version, Kubernetes/kubelet version, Docker version, OS name)
Node Controller
The node controller is a Kubernetes master component which manages various aspects of nodes.
What it does:
The CCM consolidates all of the cloud-dependent logic from the preceding three components to create a single point of integration with the cloud. The new architecture with the CCM looks like this
The default pull policy is IfNotPresent
which causes the Kubelet to not pull an image if it already exists.
To force a pull, use imagePullPolicy: Always.
The recommended practice is "Vxx + IfNotPresent" rather than "latest + Always", because with latest you don't know which version is actually running. In practice, the pull is delegated to a runtime such as Docker, so even with Always large amounts of data are not re-downloaded, since the layers already exist; in that respect Always is harmless.
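A sketch of the recommended pinned-tag combination (container name and image tag are illustrative):

```yaml
containers:
- name: app                     # hypothetical container name
  image: nginx:1.13             # pin a version ("Vxx") instead of :latest
  imagePullPolicy: IfNotPresent # pull only if the image is missing locally
```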
Available options:
Using Google Container Registry
Using AWS EC2 Container Registry
Using Azure Container Registry (ACR)
Via $HOME/.docker/config.json (credential expiry may be an issue?)
$ kubectl create secret docker-registry myregistrykey --docker-server=DOCKER_REGISTRY_SERVER --docker-username=DOCKER_USER --docker-password=DOCKER_PASSWORD --docker-email=DOCKER_EMAIL
secret "myregistrykey" created.
Secrets can also be created from the contents of .docker/config.json via YAML, without kubectl.
How to use the created imagePullSecrets:
They can be specified in the pod spec, or set automatically via a serviceaccount.
You can use this in conjunction with a per-node .docker/config.json. The credentials will be merged
. This approach will work on Google Container Engine (GKE).
apiVersion: v1
kind: Pod
metadata:
name: foo
namespace: awesomeapps
spec:
containers:
- name: foo
image: janedoe/awesomeapp:v1
imagePullSecrets:
  - name: myregistrykey
Usage note: the AlwaysPullImages admission controller sometimes needs to be enabled, e.g., in multi-tenant clusters; otherwise one tenant could run another tenant's already-pulled private images.
For the various ways of mounting metadata into a container (as files or environment variables), see kubernetes.io/docs/tasks/… and the related documentation.
The host/port of every service that exists at pod creation time is injected into the container as environment variables (apparently scoped to the pod's namespace). This guarantees that services are reachable even without the DNS add-on, although this mechanism is unreliable.
There are currently two hooks: PostStart and PreStop. If a hook handler hangs, the Pod's state transition blocks.
Two handler types are supported: Exec and HTTP.
Given these characteristics, PostStart and PreStop are designed for very lightweight commands; for heavier work, consider an init container or a defer container (not yet implemented; there is an open issue).
A hook is generally delivered once, but exactly-once delivery is not guaranteed.
If a handler fails for some reason, it broadcasts an event.
You can see these events by running kubectl describe pod
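A sketch showing both hook types on one container (the shutdown endpoint and file path are hypothetical):

```yaml
containers:
- name: app
  image: nginx
  lifecycle:
    postStart:                  # runs right after the container is created
      exec:
        command: ["/bin/sh", "-c", "echo started > /tmp/started"]
    preStop:                    # runs before the container is terminated
      httpGet:
        path: /shutdown         # hypothetical endpoint
        port: 8080
```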
What a Pod is: the smallest unit of deployment; it encompasses one or more application containers, (shared) storage resources, a network IP, and options.
A Pod encapsulates an application container (or, in some cases, multiple containers), storage resources, a unique network IP, and options that govern how the container(s) should run. A Pod represents a unit of deployment: a single instance of an application in Kubernetes, which might consist of either a single container or a small number of containers that are tightly coupled and that share resources.
References:
blog.kubernetes.io/2015/06/the… (use cases for multiple containers in one pod: Sidecar (git, log, ...), Ambassador (proxy, transparent proxy), Adapter (exporter), ...)
blog.kubernetes.io/2016/06/con…
An example — what multiple containers in a pod share:
Pods are designed as relatively ephemeral, disposable entities. Pods do not, by themselves, self-heal; Kubernetes uses a higher-level abstraction, called a Controller, that handles the work of managing the relatively disposable Pod instances.
A Controller can create and manage multiple Pods for you, handling replication and rollout and providing self-healing capabilities at cluster scope
. For example, if a Node fails, the Controller might automatically replace the Pod by scheduling an identical replacement on a different Node.
Some examples of Controllers that contain one or more pods include: Deployment, StatefulSet, and DaemonSet.
Controllers use Pod Templates to make actual pods.
A pod template does not specify the desired state of all replicas; a pod, by contrast, does specify the desired state of all containers belonging to that pod.
A Pod’s status field is a PodStatus object, which has a phase field.
Phase | Description
---|---
Pending | The Pod has been accepted by the Kubernetes system, but one or more of the Container images has not been created. |
Running | The Pod has been bound to a node, and all of the Containers have been created. At least one Container is still running, or is in the process of starting or restarting |
Succeeded | All Containers in the Pod have terminated in success, and will not be restarted. |
Failed | All Containers in the Pod have terminated, and at least one Container has terminated in failure. |
Unknown | For some reason the state of the Pod could not be obtained, typically due to an error communicating with the Pod's host.
Pod termination
A Pod has a PodStatus, which has an array of PodConditions.Each element of the PodCondition array has a type field and a status field.
status:
conditions:
- lastProbeTime: null
lastTransitionTime: 2017-10-28T06:30:03Z
status: "True"
type: Initialized
- lastProbeTime: null
lastTransitionTime: 2017-10-28T06:30:13Z
status: "True"
type: Ready
- lastProbeTime: null
lastTransitionTime: 2017-10-28T06:30:03Z
status: "True"
type: PodScheduled
containerStatuses:
- containerID: docker://dd82608cabe226247bcbc8d5fbce6121edf935320486c41046481000dbb7784f
image: deis/brigade-api:latest
imageID: docker-pullable://deis/brigade-api@sha256:943cf822adddf6869ff02d2e1a55cbb19c96d01be41e88d1d56bc16a50f5c91f
lastState: {}
name: brigade
ready: true
restartCount: 0
state:
running:
      startedAt: 2017-10-28T06:30:06Z
A Probe is a diagnostic performed periodically by the kubelet on a Container. To perform a diagnostic, the kubelet calls a Handler implemented by the Container.
Three check mechanisms: ExecAction, TCPSocketAction, and HTTPGetAction.
Three results: Success, Failure, Unknown.
Two probe types: livenessProbe (tied to the restart policy) and readinessProbe.
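A minimal sketch of the two probe types (the /healthz endpoint and /tmp/ready file are hypothetical):

```yaml
containers:
- name: app
  image: nginx
  livenessProbe:            # failure triggers the pod's restartPolicy
    httpGet:
      path: /healthz
      port: 8080
    initialDelaySeconds: 3
    periodSeconds: 10
  readinessProbe:           # failure removes the pod from Service endpoints
    exec:
      command: ["cat", "/tmp/ready"]
```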
todo
Job: for Pods that are expected to terminate, for example batch computations. Jobs are appropriate only for Pods with restartPolicy equal to OnFailure or Never.
ReplicationController, ReplicaSet, or Deployment: for Pods that are not expected to terminate, for example web servers. These are appropriate only for Pods with a restartPolicy of Always.
DaemonSet: for Pods that need to run one per machine, because they provide a machine-specific system service.
The notable point here is that if a pod is designed to run to completion, its restartPolicy must not be Always.
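A minimal run-to-completion sketch: a Job whose pod uses OnFailure, since Always is not allowed here (the pi computation is just an illustration):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  template:
    spec:
      restartPolicy: OnFailure   # Never is also valid; Always is rejected for Jobs
      containers:
      - name: pi
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(100)"]
```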
Current pod phase | Container event | restartPolicy | Action on container | Log | Resulting pod phase
---|---|---|---|---|---
Running | exits with success | Always | Restart Container | Log completion event | Running |
Running | exits with success | OnFailure | - | Log completion event | Succeeded |
Running | exits with success | Never | - | Log completion event | Succeeded |
Running | exits with failure | Always | Restart Container | Log failure event | Running |
Running | exits with failure | OnFailure | Restart Container | Log failure event | Running |
Running | exits with failure | Never | - | Log failure event | Failed |
Running | oom | Always | Restart Container | Log OOM event | Running |
Running | oom | OnFailure | Restart Container | Log OOM event | Running |
Running | oom | Never | - | Log OOM event | Failed |
For a pod with two containers:

Current pod phase | Container 1 event | restartPolicy | Action on container | Log | Resulting pod phase
---|---|---|---|---|---
Running | exits with failure | Always | Restart Container | Log failure event | Running |
Running | exits with failure | OnFailure | Restart Container | Log failure event | Running |
Running | exits with failure | Never | - | Log failure event | Running; if container 2 also exits => Failed
Init containers are commonly used to perform set-up, or to wait for set-up to complete.
Init Containers are exactly like regular Containers, except:
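A sketch of the wait-for-set-up pattern (the mydb service name is hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  initContainers:              # run to completion, in order, before app containers
  - name: wait-for-db
    image: busybox
    command: ["sh", "-c", "until nslookup mydb; do sleep 2; done"]
  containers:
  - name: app
    image: nginx
```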
A PodPreset is a method of injecting metadata into pods.
A PodPreset causes the admission controller to transparently modify the spec of a matching class of pods, dynamically injecting information the pods depend on, such as env vars and volume mounts.
Behavior: when a PodPreset is applied to one or more Pods, Kubernetes modifies the pod spec. For Env, EnvFrom, and VolumeMounts, Kubernetes modifies the spec of all containers in the Pod; for Volumes, it modifies the Pod spec.
Example:
kind: PodPreset
apiVersion: settings.k8s.io/v1alpha1
metadata:
name: allow-database
namespace: myns
spec:
selector:
matchLabels:
role: frontend
env:
- name: DB_PORT
value: "6379"
volumeMounts:
- mountPath: /cache
name: cache-volume
volumes:
- name: cache-volume
      emptyDir: {}
Reference: www.jianshu.com/p/83fe99a5e…
The PodSecurityPolicy admission control allows control over the creation and modification of cluster resources, based on the capabilities those resources are permitted to use cluster-wide.
If some policy matches, the Pod is admitted; if the request matches no PSP, the Pod is rejected.
Unavoidable cases, i.e., involuntary disruptions to an application, e.g.: hardware failure, kernel panic, the node disappearing, eviction of a pod due to the node being out of resources, and so on.
Voluntary disruptions, e.g.: deleting/updating the deployment/pod, draining a node for repair or upgrade, or scaling the cluster down.
How to mitigate involuntary disruptions: request the resources you need, replicate, and spread replicas.
In Kubernetes, to keep a service uninterrupted and its SLA intact, applications should be deployed with replication. A PodDisruptionBudget controller lets you set the minimum number, or minimum percentage, of application pods that must remain running, guaranteeing that a voluntary disruption never destroys too many pods at once.
Use tools that call the Eviction API rather than deleting pods directly, because the Eviction API respects Pod Disruption Budgets; for example, the kubectl drain command.
References:
www.kubernetes.org.cn/2486.html
ju.outofmemory.cn/entry/32756…
Write disruption tolerant applications and use PDBs
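A minimal PodDisruptionBudget sketch (name and selector are hypothetical):

```yaml
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  minAvailable: 2          # absolute count; a percentage like "50%" also works
  selector:
    matchLabels:
      app: nginx
```

With this in place, kubectl drain will refuse to evict pods once doing so would drop the matched set below the budget.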
ReplicaSets are generally not used directly; they are mainly used by Deployments as a mechanism to orchestrate pod creation, deletion, and updates.
A ReplicaSet ensures that a specified number of pod replicas are running at any given time.
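A minimal ReplicaSet sketch in the style of the other manifests here (names are hypothetical); in practice you would usually let a Deployment create this for you:

```yaml
apiVersion: apps/v1beta2
kind: ReplicaSet
metadata:
  name: frontend
spec:
  replicas: 3                # the number of pods to keep running
  selector:
    matchLabels:
      tier: frontend
  template:                  # pods are stamped out from this template
    metadata:
      labels:
        tier: frontend       # must match the selector
    spec:
      containers:
      - name: nginx
        image: nginx
```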
Some operations:
Omitted; ReplicationController is no longer recommended.
A Deployment controller provides declarative updates
for Pods and ReplicaSets.
Pod-template-hash label: this label ensures that child ReplicaSets of a Deployment do not overlap. It is generated by hashing the PodTemplate of the ReplicaSet and using the resulting hash as the label value that is added to the ReplicaSet selector, Pod template labels, and in any existing Pods that the ReplicaSet might have.
Deployment can ensure that only a certain number of Pods may be down while they are being updated. By default, it ensures that at least 1 less than the desired number of Pods are up (1 max unavailable).
rollout, rollout history/status, undo......
Proportional scaling: with RollingUpdate (maxSurge, maxUnavailable), the number of pods may briefly exceed the desired count.
$ kubectl get deploy
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
nginx-deployment 10 10 10 10 50s
$ kubectl set image deploy/nginx-deployment nginx=nginx:sometag
deployment "nginx-deployment" image updated
$ kubectl get rs
NAME DESIRED CURRENT READY AGE
nginx-deployment-1989198191 5 5 0 9s
nginx-deployment-618515232 8 8 8 1m
You can set .spec.revisionHistoryLimit field in a Deployment to specify how many old ReplicaSets for this Deployment you want to retain
Note: canary deployments are not currently supported natively; the recommended approach is to implement them with multiple Deployments.
Since 1.5, StatefulSet replaces PetSet. It manages the deployment and scaling of a set of Pods, and provides guarantees about the ordering and uniqueness of these Pods.
"Stateful" implies stable, unique network identifiers; stable, persistent storage; and ordered deployment, scaling, and deletion.
Components of a StatefulSet, by example:
- A Headless Service (with a selector), named nginx, is used to control the network domain. This kind of service has no load balancer; kube-proxy does not handle it, and DNS returns the backend endpoints directly.
- A StatefulSet, named web, has a Spec that indicates that 3 replicas of the nginx container will be launched in unique Pods.
- volumeClaimTemplates will provide stable storage using PersistentVolumes provisioned by a PersistentVolume Provisioner.

apiVersion: v1
kind: Service
metadata:
name: nginx
labels:
app: nginx
spec:
ports:
- port: 80
name: web
clusterIP: None
selector:
app: nginx
---
apiVersion: apps/v1beta2
kind: StatefulSet
metadata:
name: web
spec:
selector:
matchLabels:
app: nginx # has to match .spec.template.metadata.labels
serviceName: "nginx"
replicas: 3 # by default is 1
template:
metadata:
labels:
app: nginx # has to match .spec.selector.matchLabels
spec:
terminationGracePeriodSeconds: 10
containers:
- name: nginx
image: gcr.io/google_containers/nginx-slim:0.8
ports:
- containerPort: 80
name: web
volumeMounts:
- name: www
mountPath: /usr/share/nginx/html
volumeClaimTemplates:
- metadata:
name: www
spec:
accessModes: [ "ReadWriteOnce" ]
storageClassName: my-storage-class
resources:
requests:
          storage: 1Gi
Cluster Domain | Service (ns/name) | StatefulSet (ns/name) | StatefulSet Domain | Pod DNS | Pod Hostname
---|---|---|---|---|---
cluster.local | default/nginx | default/web | nginx.default.svc.cluster.local | web-{0..N-1}.nginx.default.svc.cluster.local | web-{0..N-1}
cluster.local | foo/nginx | foo/web | nginx.foo.svc.cluster.local | web-{0..N-1}.nginx.foo.svc.cluster.local | web-{0..N-1}
kube.local | foo/nginx | foo/web | nginx.foo.svc.kube.local | web-{0..N-1}.nginx.foo.svc.kube.local | web-{0..N-1}
In Kubernetes 1.7 and later, StatefulSet allows you to relax its ordering guarantees while preserving its uniqueness and identity guarantees via its .spec.podManagementPolicy field.
Update strategies: OnDelete, RollingUpdate, and Partitions.
A DaemonSet runs one pod per node, as a daemon.
When you delete an object, you can specify whether the object’s dependents are also deleted automatically. Deleting dependents automatically is called cascading deletion
.There are two modes of cascading deletion: background and foreground.
Foreground deletion: the root object first enters a "deletion in progress" state => the garbage collector deletes all of the object's dependents => the owner object is deleted.
Background deletion: Kubernetes deletes the owner object immediately, and the garbage collector then deletes the dependents in the background.
Deployments must use propagationPolicy: Foreground.
Custom resources do not currently support garbage collection.
To control the cascading deletion policy, set the deleteOptions.propagationPolicy field on your owner object. Possible values include 「Orphan」, 「Foreground」, or 「Background」.
The default garbage collection policy for many controller resources is orphan, including ReplicationController, ReplicaSet, StatefulSet, DaemonSet, and Deployment.
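Because the default for these controllers is orphaning, the policy can be set explicitly in the delete request. A sketch of the deleteOptions body sent when deleting the owner object:

```json
{
  "kind": "DeleteOptions",
  "apiVersion": "v1",
  "propagationPolicy": "Foreground"
}
```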
todo
todo
This is starting to read like "Effective Kubernetes":
todo
Omitted.
Resource requests and limits
todo
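A minimal sketch of requests and limits on a container (the numbers are illustrative):

```yaml
containers:
- name: app
  image: nginx
  resources:
    requests:          # used by the scheduler when choosing a node
      cpu: 250m        # 0.25 of a CPU core
      memory: 64Mi
    limits:            # enforced at runtime; exceeding the memory limit can OOM-kill
      cpu: 500m
      memory: 128Mi
```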
apiVersion: v1
kind: Pod
metadata:
name: nginx
labels:
env: test
spec:
containers:
- name: nginx
image: nginx
imagePullPolicy: IfNotPresent
nodeSelector:
    disktype: ssd
kubernetes.io/hostname
failure-domain.beta.kubernetes.io/zone
failure-domain.beta.kubernetes.io/region
beta.kubernetes.io/instance-type
beta.kubernetes.io/os
beta.kubernetes.io/arch
apiVersion: v1
kind: Pod
metadata:
name: with-node-affinity
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/e2e-az-name
operator: In
values:
- e2e-az1
- e2e-az2
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 1
preference:
matchExpressions:
- key: another-node-label-key
operator: In
values:
- another-node-label-value
containers:
- name: with-node-affinity
    image: gcr.io/google_containers/pause:2.0
Node affinity is specified on the pod and describes which nodes the pod would like to run on.
Implementations: Contiv, Contrail, Flannel, GCE, L2 networks and Linux bridging, Nuage, OpenVSwitch, OVN, Calico, Romana, Weave Net.
Kubernetes audit is part of kube-apiserver logging all requests coming to the server.
The kubelet can pro-actively monitor for and prevent against total starvation of a compute resource. In those cases, the kubelet can pro-actively fail one or more pods in order to reclaim the starved resource. When the kubelet fails a pod, it terminates all containers in the pod, and the PodPhase is transitioned to Failed.
Eviction Thresholds:
- A soft eviction threshold pairs an eviction threshold with a required, administrator-specified grace period.
- A hard eviction threshold has no grace period; if observed, the kubelet takes immediate action to reclaim the associated starved resource.

Federation makes it easy to manage multiple clusters. It does so by providing 2 major building blocks:
- Sync resources across clusters: Federation provides the ability to keep resources in multiple clusters in sync. This can be used, for example, to ensure that the same deployment exists in multiple clusters.
- Cross cluster discovery: It provides the ability to auto-configure DNS servers and load balancers with backends from all clusters. This can be used, for example, to ensure that a global VIP or DNS record can be used to access backends from multiple clusters.
Setting up Cluster Federation with Kubefed
Rescheduler ensures that critical add-ons are always scheduled. If the scheduler determines that no node has enough free resources to run the critical add-on pod given the pods that are already running in the cluster the rescheduler tries to free up space for the add-on by evicting some pods; then the scheduler will schedule the add-on pod.
You can set a temporary "CriticalAddonsOnly" taint, so that the freed node is used only for deploying critical add-on pods and other pods cannot be scheduled onto it.
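A sketch of the toleration a critical add-on pod would carry so that it alone can land on a node tainted this way:

```yaml
spec:
  tolerations:
  - key: CriticalAddonsOnly
    operator: Exists       # tolerate the taint regardless of its value
```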
Static pods are managed directly by kubelet daemon on a specific node, without API server observing it. It does not have associated any replication controller, kubelet daemon itself watches it and restarts it when it crashes. There is no health check though. Static pods are always bound to one kubelet daemon and always run on the same node with it.
The kubelet automatically creates a so-called mirror pod on the Kubernetes API server for each static pod, so the pods are visible there, but they cannot be controlled from the API server.
If you are running clustered Kubernetes and are using static pods to run a pod on every node, you should probably be using a DaemonSet!
Static pods can be configured via --pod-manifest-path or --manifest-url.
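A sketch of a static pod: a manifest dropped into the kubelet's manifest directory (the path assumes the kubelet runs with --pod-manifest-path=/etc/kubernetes/manifests):

```yaml
# /etc/kubernetes/manifests/static-web.yaml, placed on the node itself;
# the kubelet starts it directly, and a read-only mirror pod appears in the API
apiVersion: v1
kind: Pod
metadata:
  name: static-web
spec:
  containers:
  - name: web
    image: nginx
```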
Safe sysctl
: In addition to proper namespacing a safe sysctl must be properly isolated between pods on the same node.
# Ways to access the REST API
# 1. proxy
kubectl proxy --port=8083 &
curl localhost:8083/api
# 2. Direct access
$ APISERVER=$(kubectl config view | grep server | cut -f 2- -d ":" | tr -d " ")
$ TOKEN=$(kubectl describe secret $(kubectl get secrets | grep default | cut -f1 -d ' ') | grep -E '^token' | cut -f2 -d':' | tr -d '\t')
$ curl $APISERVER/api --header "Authorization: Bearer $TOKEN" --insecure
several options for connecting to nodes, pods and services from outside the cluster:
# Discovering builtin services
kubectl cluster-info
Kinds of Kubernetes proxies:
Services: for Kubernetes-native applications, Kubernetes offers the Endpoints API, which is updated whenever the set of Pods in a Service changes. For non-native applications, Kubernetes offers a virtual-IP-based bridge to Services which redirects to the backend Pods.
A Service of type: ExternalName forwards traffic to an external service.
Service types: ClusterIP (the default), NodePort (opens a port on every node that forwards to the service), LoadBalancer (depends on the IaaS; provisions an EXTERNAL-IP), and ExternalName.
kind: Service
apiVersion: v1
metadata:
name: my-service
spec:
selector:
app: MyApp
ports:
- protocol: TCP
    port: 80 # port exposed by the service
    targetPort: 9376 # port on the pod; defaults to the value of port
kind: Service
apiVersion: v1
metadata:
name: my-service
namespace: prod
spec:
type: ExternalName
  externalName: my.database.example.com
tutorial
kubectl exec -ti busybox -- nslookup kubernetes.default