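With Kubernetes now in widespread use, keeping clusters stable has become a key concern for development and operations teams. When deploying applications to a cluster, something as simple as forgetting to configure resource requests or limits can break autoscaling or even cause workloads to run out of resources. Configuration problems like these often lead to production outages, and Polaris is a tool for preventing them. Polaris is an open-source Kubernetes cluster health-check tool developed by Fairwinds. It analyzes the deployment configuration in a cluster to find and prevent configuration problems that affect the cluster's stability, reliability, scalability, and security.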
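Polaris finds problems in a cluster by analyzing its deployment configuration. Its goal goes beyond finding problems: it also offers ways to avoid them, so that the cluster stays healthy. Polaris consists of three components (the dashboard, the webhook, and the CLI), each with a different function: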
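The dashboard is the visualization tool that Polaris provides. It shows an overview of the state of your Kubernetes workloads and where they can be improved. Results can also be viewed by category, namespace, and workload.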
# kubectl apply -f https://github.com/fairwindsops/polaris/releases/latest/download/dashboard.yaml
# kubectl port-forward --namespace polaris svc/polaris-dashboard 8080:80
Viewing check results by category
Viewing check results by namespace
polaris dashboard --port 8080 --audit-path=/Users/mervinwang/Tencent/Code/Kubernetes/app/nginx
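Polaris can run as an admission controller that acts as a validating webhook. It accepts the same configuration as the dashboard and can run the same validations. The webhook rejects any workload that triggers a validation error. This reflects Polaris's larger goal: not only to encourage better configuration through the visibility the dashboard provides, but to actually enforce it through the webhook. Polaris does not fix workloads; it only blocks them.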
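To make the "reject, never repair" behaviour concrete, here is a minimal Python sketch of the decision a validating webhook sends back to the API server. It is not Polaris's implementation: the function name `build_admission_response` and the `failed_checks` input are hypothetical. Only the response shape (`uid`, `allowed`, `status.message`) follows the Kubernetes `admission.k8s.io/v1` AdmissionReview format.

```python
def build_admission_response(request_uid, failed_checks):
    """Build an AdmissionReview response for a validating webhook.

    failed_checks is a hypothetical list of check IDs that the workload failed.
    """
    allowed = not failed_checks
    response = {"uid": request_uid, "allowed": allowed}
    if not allowed:
        # A validating webhook only explains the rejection; it sends no patch.
        response["status"] = {
            "message": "rejected by policy: " + ", ".join(sorted(failed_checks))
        }
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }
```

The response never carries a patch, which is why a workload that fails validation is blocked rather than corrected.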
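Polaris can also be used on the command line to audit local files or a running cluster. This is especially helpful for running Polaris against infrastructure-as-code in a CI/CD pipeline. Command-line flags can make the pipeline fail if the audit score falls below a threshold or if any error is reported.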
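Polaris has its own flags for failing a pipeline, but the same gate can be applied to a saved JSON report. The following is a rough sketch under that assumption: the function names `iter_check_results` and `should_fail` are hypothetical, while the field names (`Score`, `Results`, `PodResult`, `ContainerResults`, `Success`, `Severity`) are the ones that appear in the audit output reproduced in this article.

```python
def iter_check_results(report):
    """Yield every check result dict found in an audit report."""
    for item in report.get("Results", []):
        # Checks can sit on the controller, the pod, or each container.
        yield from (item.get("Results") or {}).values()
        pod = item.get("PodResult") or {}
        yield from (pod.get("Results") or {}).values()
        for container in pod.get("ContainerResults") or []:
            yield from (container.get("Results") or {}).values()


def should_fail(report, min_score):
    """Return True if the score is below min_score or any danger check failed."""
    if report.get("Score", 0) < min_score:
        return True
    return any(
        not check["Success"] and check["Severity"] == "danger"
        for check in iter_check_results(report)
    )
```

A CI step could load the report with `json.load` and exit with a non-zero status whenever `should_fail` returns `True`.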
Polaris supports three installation methods: kubectl, helm, and local binary. This article uses the simplest method for each of the three components.
Helm
Add the Helm chart repository
helm repo add reactiveops-stable https://charts.reactiveops.com/stable
Update the chart repository and install the Dashboard component
helm upgrade --install polaris reactiveops-stable/polaris --namespace polaris
To view the dashboard locally, forward a local port with the following command:
kubectl port-forward --namespace polaris svc/polaris-dashboard 8080:80
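Once the Webhook component is installed in a cluster, it blocks applications that do not meet the configured standards from being deployed.
Helm
Add the Helm chart repository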
helm repo add reactiveops-stable https://charts.reactiveops.com/stable
Update the chart repository and install the Webhook component
helm upgrade --install polaris reactiveops-stable/polaris --namespace polaris \
--set webhook.enable=true --set dashboard.enable=false
To test Polaris locally, download the binary from the releases page, or install it with Homebrew:
brew tap reactiveops/tap
brew install reactiveops/tap/polaris
polaris --version
Use the CLI to check local configuration files
polaris --audit --audit-path ./deploy/
The scan results can be saved to a YAML file
polaris --audit --output-format yaml > report.yaml
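The sections above briefly covered installing Polaris and its basic usage. To fit Polaris to a real project, however, the default configuration is not enough, so we also need to know how to write the configuration file that defines Polaris's check rules. Before customizing the configuration, we first need to understand the severity levels and the check types that Polaris supports. The severity levels are danger, warning, and ignore; Polaris does not run checks whose level is ignore. The supported check types are Health Checks, Images, Networking, Resources, and Security, which are introduced in turn below:
Polaris can verify that readiness and liveness probes are configured for pods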
key | default | description |
---|---|---|
readinessProbeMissing | warning | Fails when no readiness probe is configured for a pod. |
livenessProbeMissing | warning | Fails when no liveness probe is configured for a pod. |
tagNotSpecified | danger | Fails when no tag is specified for an image, or when the tag is latest. |
pullPolicyNotAlways | warning | Fails when the image pull policy is not always. |
priorityClassNotSet | ignore | Fails when no priorityClassName is configured for a pod. |
multipleReplicasForDeployment | ignore | Fails when a Deployment has only one replica. |
missingPodDisruptionBudget | ignore | |
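To illustrate what checks of this kind look at, here is a small Python sketch that applies the probe and image-tag rules to a container spec given as a plain dict. It is an approximation for illustration, not Polaris's code: the function names are hypothetical, and treating a digest-pinned image as "tag specified" is an assumption of this sketch.

```python
def missing_probes(container):
    """Return the IDs of the probe checks that a container spec would fail."""
    failed = []
    if "readinessProbe" not in container:
        failed.append("readinessProbeMissing")
    if "livenessProbe" not in container:
        failed.append("livenessProbeMissing")
    return failed


def tag_not_specified(image):
    """Return True if the image has no tag or uses the `latest` tag."""
    if "@" in image:
        # Assumption: an image pinned by digest counts as explicitly versioned.
        return False
    last_segment = image.rsplit("/", 1)[-1]  # skip any registry host:port part
    if ":" not in last_segment:
        return True
    return last_segment.rsplit(":", 1)[1] == "latest"
```

Splitting on the last `/` first matters because a registry address such as `localhost:5000/nginx` contains a colon that is a port, not a tag.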
Polaris can verify that memory and CPU requests and limits are configured
key | default | description |
---|---|---|
cpuRequestsMissing | warning | Fails when resources.requests.cpu is not configured. |
memoryRequestsMissing | warning | Fails when resources.requests.memory is not configured. |
cpuLimitsMissing | warning | Fails when resources.limits.cpu is not configured. |
memoryLimitsMissing | warning | Fails when resources.limits.memory is not configured. |
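For resources such as memory and CPU, you can also configure range checks. The check passes only when the configured value falls within the specified range.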
limits:
  type: object
  required:
  - memory
  - cpu
  properties:
    memory:
      type: string
      resourceMinimum: 100M
      resourceMaximum: 6G
    cpu:
      type: string
      resourceMinimum: 100m
      resourceMaximum: "2"
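The range check above compares Kubernetes quantities with different unit suffixes, so the values have to be converted before comparing. The sketch below shows one way to do that arithmetic in Python. It is an illustration only: `parse_quantity` and `in_range` are hypothetical helpers, and only the common suffixes listed in `SUFFIXES` are handled.

```python
from decimal import Decimal

# Multipliers for the quantity suffixes handled by this sketch.
SUFFIXES = {
    "m": Decimal("0.001"),
    "k": Decimal(10) ** 3,
    "M": Decimal(10) ** 6,
    "G": Decimal(10) ** 9,
    "T": Decimal(10) ** 12,
    "Ki": Decimal(2) ** 10,
    "Mi": Decimal(2) ** 20,
    "Gi": Decimal(2) ** 30,
    "Ti": Decimal(2) ** 40,
}


def parse_quantity(value):
    """Convert a quantity such as "100m", "6G" or "512Mi" to a Decimal."""
    text = str(value)
    # Check two-character suffixes (Ki, Mi, ...) before one-character ones.
    for length in (2, 1):
        suffix = text[-length:]
        if suffix in SUFFIXES and len(text) > length:
            return Decimal(text[:-length]) * SUFFIXES[suffix]
    return Decimal(text)


def in_range(value, minimum, maximum):
    """Return True if minimum <= value <= maximum after unit conversion."""
    return parse_quantity(minimum) <= parse_quantity(value) <= parse_quantity(maximum)
```

With the limits from the configuration above, `in_range("500m", "100m", "2")` passes for CPU, while `in_range("64Mi", "100M", "6G")` fails for memory because 64Mi is about 67 MB, below the 100M minimum.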
key | default | description |
---|---|---|
hostIPCSet | danger | Fails when hostIPC attribute is configured. |
hostPIDSet | danger | Fails when hostPID attribute is configured. |
notReadOnlyRootFilesystem | warning | Fails when securityContext.readOnlyRootFilesystem is not true. |
privilegeEscalationAllowed | danger | Fails when securityContext.allowPrivilegeEscalation is true. |
runAsRootAllowed | warning | Fails when securityContext.runAsNonRoot is not true. |
runAsPrivileged | danger | Fails when securityContext.privileged is true. |
insecureCapabilities | warning | Fails when securityContext.capabilities includes one of the capabilities that Polaris lists as insecure. |
dangerousCapabilities | danger | Fails when securityContext.capabilities includes one of the capabilities that Polaris lists as dangerous. |
hostNetworkSet | warning | Fails when hostNetwork attribute is configured. |
hostPortSet | warning | Fails when hostPort attribute is configured. |
tlsSettingsMissing | warning | Fails when an Ingress lacks TLS settings. |
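With the information above, we can already define a scan configuration that fits our project. If the checks Polaris provides are not enough, we can also define custom checks. For example, we can write a custom rule that inspects the image source and raises a warning when an image comes from quay.io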
checks:
  imageRegistry: warning
customChecks:
  imageRegistry:
    successMessage: Image comes from allowed registries
    failureMessage: Image should not be from disallowed registry
    category: Images
    target: Container # target can be "Container" or "Pod"
    schema:
      '$schema': http://json-schema.org/draft-07/schema
      type: object
      properties:
        image:
          type: string
          not:
            pattern: ^quay.io
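The `not` / `pattern` pair in this schema passes an image only when the regular expression does not match it. A short Python sketch with the standard `re` module shows the effect; JSON Schema patterns are unanchored searches, which `re.search` mirrors. The helper `image_allowed` and the stricter pattern `DISALLOWED_STRICT` are hypothetical additions for illustration, not part of the original configuration.

```python
import re

# The pattern exactly as written in the custom check.
DISALLOWED = re.compile(r"^quay.io")
# Hypothetical stricter variant: escape the dot and require the path separator.
DISALLOWED_STRICT = re.compile(r"^quay\.io/")


def image_allowed(image, pattern=DISALLOWED):
    """Emulate `not: {pattern: ...}`: pass only when the pattern does NOT match."""
    return pattern.search(image) is None
```

Because the dot in `^quay.io` is unescaped, it matches any character, so an image named `quayXio/app:1.0` would also be flagged. If that matters for your registries, escaping the dot as in the stricter variant may be worth considering.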
You can also specify which checks to run and at what severity
checks:
  cpuRequestsMissing: danger
  memoryRequestsMissing: danger
  cpuLimitsMissing: danger
  memoryLimitsMissing: danger
polaris audit -c check_config.yaml --.......
{
  "PolarisOutputVersion": "1.0",
  "AuditTime": "2021-07-01T15:07:00+08:00",
  "SourceType": "Path",
  "SourceName": "/Users/mervinwang/Tencent/Code/Kubernetes/app/nginx",
  "DisplayName": "/Users/mervinwang/Tencent/Code/Kubernetes/app/nginx",
  "ClusterInfo": {
    "Version": "unknown",
    "Nodes": 0,
    "Pods": 0,
    "Namespaces": 0,
    "Controllers": 1
  },
  "Results": [
    {
      "Name": "nginx-config",
      "Namespace": "",
      "Kind": "ConfigMap",
      "Results": {},
      "PodResult": null,
      "CreatedTime": "0001-01-01T00:00:00Z"
    },
    {
      "Name": "nginx-deployment",
      "Namespace": "",
      "Kind": "Deployment",
      "Results": {},
      "PodResult": {
        "Name": "",
        "Results": {},
        "ContainerResults": [
          {
            "Name": "nginx",
            "Results": {
              "cpuLimitsMissing": {
                "ID": "cpuLimitsMissing",
                "Message": "CPU limits should be set",
                "Details": null,
                "Success": false,
                "Severity": "danger",
                "Category": "Efficiency"
              },
              "cpuRequestsMissing": {
                "ID": "cpuRequestsMissing",
                "Message": "CPU requests should be set",
                "Details": null,
                "Success": false,
                "Severity": "danger",
                "Category": "Efficiency"
              },
              "memoryLimitsMissing": {
                "ID": "memoryLimitsMissing",
                "Message": "Memory limits should be set",
                "Details": null,
                "Success": false,
                "Severity": "danger",
                "Category": "Efficiency"
              },
              "memoryRequestsMissing": {
                "ID": "memoryRequestsMissing",
                "Message": "Memory requests should be set",
                "Details": null,
                "Success": false,
                "Severity": "danger",
                "Category": "Efficiency"
              }
            }
          }
        ]
      },
      "CreatedTime": "0001-01-01T00:00:00Z"
    }
  ],
  "Score": 0
}
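Before exporting anything, it can help to get a quick tally of what failed. The sketch below walks the nested structure shown in the report above (`Results` → `PodResult` → `ContainerResults` → `Results`) and counts failed container-level checks. The function name `summarize_failures` is hypothetical; the field names are taken from the output above.

```python
from collections import Counter


def summarize_failures(report):
    """Count failed container-level checks by (severity, check ID)."""
    counts = Counter()
    for item in report.get("Results", []):
        # Objects without pods (e.g. a ConfigMap) have PodResult set to null.
        pod = item.get("PodResult") or {}
        for container in pod.get("ContainerResults") or []:
            for check in (container.get("Results") or {}).values():
                if not check["Success"]:
                    counts[(check["Severity"], check["ID"])] += 1
    return counts
```

Loading the report with `json.load` and passing it to `summarize_failures` gives a `Counter` keyed by pairs such as `("danger", "cpuLimitsMissing")`.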
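When Polaris audits a cluster, the result it returns is JSON, which is not easy to read at a glance. We use Python to process the result and write it to an Excel spreadsheet for easier review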
import os
import subprocess

import xlsxwriter
import yaml

# Paths: the check config and the cluster list live next to this script.
BASE_DIR = os.path.dirname(os.path.realpath(__file__))
CHECK_CONFIG = os.path.join(BASE_DIR, 'check_config.yaml')
CLUSTER_CONFIG = os.path.join(BASE_DIR, 'cluster_list.yaml')
RESULT_DIR = os.path.join(BASE_DIR, 'result')

# Only these controller kinds are included in the report.
SCAN_CONTROLLER_TYPES = ["Deployment", "DaemonSet", "StatefulSet"]


def read_cluster():
    """Load the cluster list (expects a top-level `clusters` key)."""
    with open(CLUSTER_CONFIG, 'r', encoding='utf-8') as f:
        return yaml.safe_load(f)


def generate_report(cluster_id: str):
    """Run `polaris audit` and write the result to result/<cluster_id>.yaml."""
    os.makedirs(RESULT_DIR, exist_ok=True)
    # Note: every cluster is scanned with the same kubeconfig, as in the original script.
    scan_command = [
        "polaris", "audit",
        "-c", CHECK_CONFIG,
        "--kubeconfig", os.path.expanduser("~/.kube/config"),
        "--only-show-failed-tests", "true",
        "--output-file", os.path.join(RESULT_DIR, f"{cluster_id}.yaml"),
    ]
    try:
        subprocess.run(scan_command, check=True)
    except (OSError, subprocess.CalledProcessError) as e:
        print(e)


def format_data(cluster):
    """Return one row per container that has at least one reported check."""
    cluster_report = os.path.join(RESULT_DIR, f'{cluster}.yaml')
    if not os.path.exists(cluster_report):
        return []
    with open(cluster_report, 'r', encoding='utf-8') as f:
        report = yaml.safe_load(f) or {}
    data_list = []
    for item in report.get("Results") or []:
        if item["Kind"] not in SCAN_CONTROLLER_TYPES or not item.get("PodResult"):
            continue
        for container in item["PodResult"].get("ContainerResults") or []:
            # With --only-show-failed-tests, these are the failed check IDs.
            check_ids = list(container.get("Results") or {})
            if not check_ids:
                continue
            data_list.append([cluster, item["Kind"], item["Namespace"], item["Name"],
                              container["Name"], str(check_ids)])
    return data_list


def excel_config(workbook):
    """Create the column names and the cell formats used in every worksheet."""
    column_name = ['ClusterID', 'Kind', 'NameSpace', 'Name', 'PodName', 'Scan Result']
    merge_format = workbook.add_format({
        'font_size': 22,
        'bold': True,
        'font_color': '#FFFFFF',
        'border': 1,
        'font_name': '蘋方-簡',
        'align': 'center',
        'valign': 'vcenter',
        'fg_color': '#0174DF'
    })
    title_format = workbook.add_format({
        'font_size': 18,
        'border': 1,
        'bold': True,
        'align': 'center',
        'font_name': '蘋方-簡',
        'valign': 'vcenter',
    })
    data_format = workbook.add_format({
        'font_size': 16,
        'border': 1,
        'align': 'center',
        'font_name': '蘋方-簡',
        'valign': 'vcenter',
    })
    return column_name, merge_format, title_format, data_format


def generate_excel():
    """Scan every cluster and write one worksheet per cluster."""
    workbook = xlsxwriter.Workbook("scan_result.xlsx")
    column_name, merge_format, title_format, data_format = excel_config(workbook)
    for cluster in read_cluster()["clusters"]:
        print(f"Scan cluster start: {cluster}")
        generate_report(cluster)
        worksheet = workbook.add_worksheet(cluster)
        worksheet.merge_range('A1:F1', f'Cluster {cluster} Requests/Limits scan results', merge_format)
        worksheet.set_column('A:F', 35)
        worksheet.set_column('F:F', 130)
        worksheet.set_row(0, 50)
        scan_result = format_data(cluster)
        if scan_result:
            worksheet.write_row('A2', column_name, title_format)
            # Data rows start at row 3, below the title and header rows.
            for row_num, row in enumerate(scan_result, start=3):
                worksheet.write_row(f'A{row_num}', row, data_format)
        else:
            # Nothing to report (or no report file) for this cluster.
            worksheet.merge_range('A3:F3', 'NOT Found INFO', data_format)
    workbook.close()


if __name__ == '__main__':
    generate_excel()