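This article is copied from http://blog.itpub.net/28916011/viewspace-2216748/ and kept as my own backup, with notes on the errors I ran into.
A complete k8s cluster should include the following six parts: kube-dns, ingress-controller, the metrics server monitoring system, dashboard, storage, and the EFK logging system.
The logging system should be deployed outside the k8s cluster, so that even if the whole cluster goes down we can still read the logs from before the outage in the external logging system.
In addition, a production logging system should sit on its own storage volume. For convenience, this test disables the storage volume feature of the logging system.
1. Add the incubator repo (this repo holds development-stage charts, so it may be unstable)
Visit https://hub.kubeapps.com/charts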
[root@master ~]# helm repo list
NAME    URL
local   http://127.0.0.1:8879/charts
stable  https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts
[root@master efk]# helm repo add incubator https://kubernetes-charts-incubator.storage.googleapis.com
"incubator" has been added to your repositories
[root@master efk]# helm repo list
NAME       URL
local      http://127.0.0.1:8879/charts
stable     https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts
incubator  https://kubernetes-charts-incubator.storage.googleapis.com
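2. Download elasticsearch
[root@master efk]# helm fetch incubator/elasticsearch
[root@master efk]# ls
elasticsearch-1.10.2.tgz
[root@master efk]# tar -xvf elasticsearch-1.10.2.tgz
3. Disable the storage volume (do not disable it in production; we only do it here to make testing easier)
[root@master efk]# vim elasticsearch/values.yaml
Change
persistence:
  enabled: true
to
persistence:
  enabled: false
There are two places that need this change.
With that, the storage volume feature is disabled and the logs are stored in a local directory instead, which is fine for a test but means the data is not kept on a persistent volume.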
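The manual edit above can also be scripted. What follows is only a minimal sketch, assuming that in values.yaml each `enabled:` key sits on the line directly after its `persistence:` key. The helper name `disable_persistence` and the sample input are hypothetical, not part of the chart.

```shell
#!/bin/sh
# Hypothetical helper: flip "enabled: true" to "enabled: false" only on the
# line that directly follows a "persistence:" key. Reads YAML on stdin.
disable_persistence() {
  sed '/^ *persistence:/{n;s/enabled: true/enabled: false/;}'
}

# Illustrative input standing in for elasticsearch/values.yaml (assumed layout).
sample_values() {
  cat <<'EOF'
master:
  persistence:
    enabled: true
data:
  persistence:
    enabled: true
someOtherFeature:
  enabled: true
EOF
}

# Only the two persistence blocks change; someOtherFeature is left alone.
sample_values | disable_persistence
```

To use it on the real file, one could run `disable_persistence < elasticsearch/values.yaml > values-test.yaml`, review the result by eye, and pass the new file to `helm install -f`.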
4. Create a dedicated namespace
[root@master efk]# kubectl create namespace efk
namespace/efk created
[root@master efk]# kubectl get ns
NAME  STATUS  AGE
efk   Active  13s
5. Install elasticsearch into the efk namespace. I did not use this method; I used the one further down.
[root@master efk]# helm install --name els1 --namespace=efk -f elasticsearch/values.yaml incubator/elasticsearch
NAME:   els1
LAST DEPLOYED: Thu Oct 18 01:59:15 2018
NAMESPACE: efk
STATUS: DEPLOYED

RESOURCES:
==> v1/Pod(related)
NAME                                        READY  STATUS   RESTARTS  AGE
els1-elasticsearch-client-58899f6794-gxn7x  0/1    Pending  0         0s
els1-elasticsearch-client-58899f6794-mmqq6  0/1    Pending  0         0s
els1-elasticsearch-data-0                   0/1    Pending  0         0s
els1-elasticsearch-master-0                 0/1    Pending  0         0s

==> v1/ConfigMap
NAME                DATA  AGE
els1-elasticsearch  4     1s

==> v1/Service
NAME                          TYPE       CLUSTER-IP      EXTERNAL-IP  PORT(S)   AGE
els1-elasticsearch-client     ClusterIP  10.103.147.142  <none>       9200/TCP  0s
els1-elasticsearch-discovery  ClusterIP  None            <none>       9300/TCP  0s

==> v1beta1/Deployment
NAME                       DESIRED  CURRENT  UP-TO-DATE  AVAILABLE  AGE
els1-elasticsearch-client  2        0        0           0          0s

==> v1beta1/StatefulSet
NAME                       DESIRED  CURRENT  AGE
els1-elasticsearch-data    2        1        0s
els1-elasticsearch-master  3        1        0s

NOTES:
The elasticsearch cluster has been installed.
***
Please note that this chart has been deprecated and moved to stable.
Going forward please use the stable version of this chart.
***
Elasticsearch can be accessed:
  * Within your cluster, at the following DNS name at port 9200:
    els1-elasticsearch-client.efk.svc
  * From outside the cluster, run these commands in the same shell:
    export POD_NAME=$(kubectl get pods --namespace efk -l "app=elasticsearch,component=client,release=els1" -o jsonpath="{.items[0].metadata.name}")
    echo "Visit http://127.0.0.1:9200 to use Elasticsearch"
    kubectl port-forward --namespace efk $POD_NAME 9200:9200
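Note: --name els1 is the release name the chart gets once deployed; pick any name you like.
Above, we installed els online using the values.yaml file. Since we have already downloaded the els chart package, we can also install offline from it, as follows:
[root@master efk]# ls
elasticsearch  elasticsearch-1.10.2.tgz
[root@master efk]# helm install --name els1 --namespace=efk ./elasticsearch
Note: ./elasticsearch is the name of the unpacked els chart directory in the current directory.
After the installation, the corresponding pods show up in the efk namespace. (When I installed elasticsearch it would not install at first, reportedly because the official elasticsearch site could not be reached, so the image could not be downloaded from it. I left it alone for two days, and when I logged in again the image had downloaded by itself.) In my own case the first attempt failed because the machine had too little memory, so I added memory. I deleted the namespace, recreated it the next day, and then the install succeeded on the first try.
Problem 2: els1-...client showed READY 0/1, and describe showed that the health check was failing. From inside the pod, pinging pods on other nodes failed, and other pods behaved the same way: pods on different nodes could not reach each other. The output of ip route show looked wrong; see the flannel chapter. Oddly, I could not find the cause. flannel itself looked normal, yet the traffic did not get through. I deleted flannel, set it up again, and then it worked.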
[root@master efk]# kubectl get pods -n efk -o wide
NAME                                        READY  STATUS   RESTARTS  AGE  IP            NODE
els1-elasticsearch-client-78b54979c5-kzj7z  1/1    Running  2         1h   10.244.2.157  node2
els1-elasticsearch-client-78b54979c5-xn2gb  1/1    Running  1         1h   10.244.2.151  node2
els1-elasticsearch-data-0                   1/1    Running  0         1h   10.244.1.165  node1
els1-elasticsearch-data-1                   1/1    Running  0         1h   10.244.2.169  node2
els1-elasticsearch-master-0                 1/1    Running  0         1h   10.244.1.163  node1
els1-elasticsearch-master-1                 1/1    Running  0         1h   10.244.2.168  node2
els1-elasticsearch-master-2                 1/1    Running  0         57m  10.244.1.170  node1
List the installed releases:
[root@master efk]# helm list
NAME  REVISION  UPDATED                   STATUS    CHART                 NAMESPACE
els1  1         Thu Oct 18 23:11:54 2018  DEPLOYED  elasticsearch-1.10.2  efk
Check the status of els1:
[root@k8s-master1 ~]# helm status els1
  * Within your cluster, at the following DNS name at port 9200:
    els1-elasticsearch-client.efk.svc    ## this is the hostname of the els1 service
  * From outside the cluster, run these commands in the same shell:
    export POD_NAME=$(kubectl get pods --namespace efk -l "app=elasticsearch,component=client,release=els1" -o jsonpath="{.items[0].metadata.name}")
    echo "Visit http://127.0.0.1:9200 to use Elasticsearch"
    kubectl port-forward --namespace efk $POD_NAME 9200:9200
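cirros is a client image made specifically for testing virtual environments. It can quickly create a KVM virtual machine, it is only a few MB in size, and the tools it ships are fairly complete.
Now we run cirros: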
[root@k8s-master1 ~]# kubectl run cirror-$RANDOM --rm -it --image=cirros -- /bin/sh
kubectl run --generator=deployment/apps.v1beta1 is DEPRECATED and will be removed in a future version. Use kubectl create instead.
If you don't see a command prompt, try pressing enter.
/ #
/ # nslookup els1-elasticsearch-client.efk.svc
Server:    10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

Name:      els1-elasticsearch-client.efk.svc
Address 1: 10.103.105.170 els1-elasticsearch-client.efk.svc.cluster.local
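Because I hit other problems after deleting the namespace, I ended up restarting the cluster. Unexpectedly, flannel broke again: the routing mode was gone and nslookup kept failing. I thought my version was at fault again, but in the end deleting flannel and generating it again fixed it.
--rm: delete the pod as soon as we exit
-it: log in interactively
Above we can see the IP address that the service name els1-elasticsearch-client.efk.svc resolves to.
Next we access http://els1-elasticsearch-client.efk.svc:9200 :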
/ # curl els1-elasticsearch-client.efk.svc:9200
curl: (6) Couldn't resolve host 'els1-elasticsearch-client.efk.svc'
/ #
/ # curl els1-elasticsearch-client.efk.svc.cluster.local:9200
{
  "name" : "els1-elasticsearch-client-b898c9d47-5gwzq",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "RFiD2ZGWSAqM2dF6wy24Vw",
  "version" : {
    "number" : "6.4.2",
    "build_flavor" : "oss",
    "build_type" : "tar",
    "build_hash" : "04711c2",
    "build_date" : "2018-09-26T13:34:09.098244Z",
    "build_snapshot" : false,
    "lucene_version" : "7.4.0",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}
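In the transcript above, curl could not resolve the short name els1-elasticsearch-client.efk.svc, while the fully qualified name worked. A minimal sketch of building the fully qualified service name, assuming the cluster domain is the default `cluster.local` (which matches the nslookup output in this article); `svc_fqdn` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: build <service>.<namespace>.svc.<cluster-domain>.
# The third argument defaults to cluster.local, which is an assumption:
# a cluster can be configured with a different domain.
svc_fqdn() {
  printf '%s.%s.svc.%s\n' "$1" "$2" "${3:-cluster.local}"
}

svc_fqdn els1-elasticsearch-client efk
```

Inside the test pod this could be used as `curl "$(svc_fqdn els1-elasticsearch-client efk):9200"`, which sidesteps whatever search-domain handling the client's resolver does or does not do.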
Look at what it offers:
/ # curl els1-elasticsearch-client.efk.svc.cluster.local:9200/_cat
=^.^=
/_cat/allocation
/_cat/shards
/_cat/shards/{index}
/_cat/master
/_cat/nodes
/_cat/tasks
/_cat/indices
/_cat/indices/{index}
/_cat/segments
/_cat/segments/{index}
/_cat/count
/_cat/count/{index}
/_cat/recovery
/_cat/recovery/{index}
/_cat/health
/_cat/pending_tasks
/_cat/aliases
/_cat/aliases/{alias}
/_cat/thread_pool
/_cat/thread_pool/{thread_pools}
/_cat/plugins
/_cat/fielddata
/_cat/fielddata/{fields}
/_cat/nodeattrs
/_cat/repositories
/_cat/snapshots/{repository}
/_cat/templates
See how many nodes there are:
/ # curl els1-elasticsearch-client.efk.svc.cluster.local:9200/_cat/nodes
10.244.2.104 23 95 0 0.00 0.02 0.05 di - els1-elasticsearch-data-0
10.244.4.83  42 99 1 0.01 0.11 0.13 mi * els1-elasticsearch-master-1
10.244.4.81  35 99 1 0.01 0.11 0.13 i  - els1-elasticsearch-client-b898c9d47-5gwzq
10.244.4.84  31 99 1 0.01 0.11 0.13 mi - els1-elasticsearch-master-2
10.244.2.105 35 95 0 0.00 0.02 0.05 i  - els1-elasticsearch-client-b898c9d47-shqd2
10.244.4.85  18 99 1 0.01 0.11 0.13 di - els1-elasticsearch-data-1
10.244.4.82  40 99 1 0.01 0.11 0.13 mi - els1-elasticsearch-master-0
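The /_cat/nodes listing can be summarised with a little awk. This is only a sketch, assuming the column layout shown in this article (node role in column 8, the elected-master marker `*` in column 9, node name in column 10). The function names are hypothetical, and the sample lines are copied from the output above.

```shell
#!/bin/sh
# Print the name of the elected master: the row whose 9th column is "*".
elected_master() {
  awk '$9 == "*" { print $10 }'
}

# Count master-eligible nodes: rows whose role column contains "m".
count_master_eligible() {
  awk '$8 ~ /m/ { n++ } END { print n + 0 }'
}

# Sample rows taken from the /_cat/nodes output in this article.
sample_nodes() {
  cat <<'EOF'
10.244.2.104 23 95 0 0.00 0.02 0.05 di - els1-elasticsearch-data-0
10.244.4.83 42 99 1 0.01 0.11 0.13 mi * els1-elasticsearch-master-1
10.244.4.81 35 99 1 0.01 0.11 0.13 i - els1-elasticsearch-client-b898c9d47-5gwzq
10.244.4.84 31 99 1 0.01 0.11 0.13 mi - els1-elasticsearch-master-2
EOF
}

sample_nodes | elected_master
sample_nodes | count_master_eligible
```

Against a live cluster the same functions could read from `curl -s els1-elasticsearch-client.efk.svc.cluster.local:9200/_cat/nodes` instead of the sample.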
6. Install fluentd into the efk namespace
[root@k8s-master1 ~]# helm fetch incubator/fluentd-elasticsearch
[root@k8s-master1 ~]# tar -xvf fluentd-elasticsearch-0.7.2.tgz
[root@k8s-master1 ~]# cd fluentd-elasticsearch
[root@k8s-master1 fluentd-elasticsearch]# vim values.yaml
(1) Change host: 'elasticsearch-client' to host: 'els1-elasticsearch-client.efk.svc.cluster.local', which tells fluentd where to find our elasticsearch service.
(2) Change the tolerations so that the k8s master also accepts fluentd pods; only then can the master node's logs be collected:
Change
tolerations: {}
  # - key: node-role.kubernetes.io/master
  #   operator: Exists
  #   effect: NoSchedule
to
tolerations:
  - key: node-role.kubernetes.io/master
    operator: Exists
    effect: NoSchedule
(3) Change the annotations so that Prometheus can also scrape fluentd's monitoring data
Change
annotations: {}
  # prometheus.io/scrape: "true"
  # prometheus.io/port: "24231"
It is roughly at this position. In my copy, annotations is followed by podAnnotations, and I made the change in the lower block. Change it to
annotations:
  prometheus.io/scrape: "true"
  prometheus.io/port: "24231"
At the same time, change
service: {}
  # type: ClusterIP
  # ports:
  #   - name: "monitor-agent"
  #     port: 24231
to
service:
  type: ClusterIP
  ports:
    - name: "monitor-agent"
      port: 24231
This way Prometheus reaches fluentd's monitoring endpoint through the service on port 24231.
Start installing fluentd:
[root@k8s-master1 fluentd-elasticsearch]# helm install --name fluentd1 --namespace=efk -f values.yaml ./
[root@k8s-master1 fluentd-elasticsearch]# helm list
NAME      REVISION  UPDATED                   STATUS    CHART                        NAMESPACE
els1      1         Sun Nov  4 09:37:35 2018  DEPLOYED  elasticsearch-1.10.2         efk
fluentd1  1         Tue Nov  6 09:28:42 2018  DEPLOYED  fluentd-elasticsearch-0.7.2  efk
[root@k8s-master1 fluentd-elasticsearch]# kubectl get pods -n efk
NAME                                       READY  STATUS   RESTARTS  AGE
els1-elasticsearch-client-b898c9d47-5gwzq  1/1    Running  0         47h
els1-elasticsearch-client-b898c9d47-shqd2  1/1    Running  0         47h
els1-elasticsearch-data-0                  1/1    Running  0         47h
els1-elasticsearch-data-1                  1/1    Running  0         45h
els1-elasticsearch-master-0                1/1    Running  0         47h
els1-elasticsearch-master-1                1/1    Running  0         45h
els1-elasticsearch-master-2                1/1    Running  0         45h
fluentd1-fluentd-elasticsearch-9k456       1/1    Running  0         2m28s
fluentd1-fluentd-elasticsearch-dcnsc       1/1    Running  0         2m28s
fluentd1-fluentd-elasticsearch-p5h88       1/1    Running  0         2m28s
fluentd1-fluentd-elasticsearch-sdvn9       1/1    Running  0         2m28s
fluentd1-fluentd-elasticsearch-ztm9s       1/1    Running  0         2m28s
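7. Install kibana into the efk namespace
Note: the kibana version must match the elasticsearch version, otherwise the two will not work together. I did not pay attention to this at first and later saw in the logs that the problem is real, so I changed kibana's values.yaml to the same version as es. Both versions can be read from the respective values.yaml files.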
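The version check can be done before installing rather than after reading the logs. A rough sketch, assuming the Elasticsearch root response is pretty-printed with the "number" key on its own line and that the kibana values.yaml carries a quoted `tag:` as in this article. All function names here are hypothetical, and the sample inputs are cut down from the outputs shown in this article.

```shell
#!/bin/sh
# Extract the version from the JSON Elasticsearch returns on "/".
# Plain sed is used so that no JSON tool is needed in a minimal container.
es_version() {
  sed -n 's/.*"number" *: *"\([^"]*\)".*/\1/p'
}

# Extract the first quoted image tag from a values.yaml read on stdin.
kibana_tag() {
  sed -n 's/^ *tag: *"\([^"]*\)".*/\1/p' | head -n 1
}

# Succeed only if both versions are non-empty and identical.
versions_match() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}

sample_es_response() {
  cat <<'EOF'
{
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "6.4.2",
    "build_flavor" : "oss"
  }
}
EOF
}

sample_kibana_values() {
  cat <<'EOF'
image:
  repository: "docker.elastic.co/kibana/kibana-oss"
  tag: "6.4.2"
EOF
}

es="$(sample_es_response | es_version)"
kb="$(sample_kibana_values | kibana_tag)"
if versions_match "$es" "$kb"; then
  echo "versions match: $es"
else
  echo "version mismatch: elasticsearch=$es kibana=$kb"
fi
```

With a running cluster, `es_version` could read from `curl -s els1-elasticsearch-client.efk.svc.cluster.local:9200` and `kibana_tag` from the unpacked chart's values.yaml.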
[root@k8s-master1 ~]# helm fetch stable/kibana
[root@k8s-master1 ~]# ls
kibana-0.2.2.tgz
[root@k8s-master1 ~]# tar -xvf kibana-0.2.2.tgz
[root@k8s-master1 ~]# cd kibana
Modify ELASTICSEARCH_URL and change the service type to NodePort
[root@master kibana]# cat values.yaml |more
image:
  repository: "docker.elastic.co/kibana/kibana-oss"
  tag: "6.4.2"
  pullPolicy: "IfNotPresent"

env:
  # All Kibana configuration options are adjustable via env vars.
  # To adjust a config option to an env var uppercase + replace `.` with `_`
  # Ref: https://www.elastic.co/guide/en/kibana/current/settings.html
  #
  ELASTICSEARCH_URL: http://els1-elasticsearch-client.efk.svc:9200
  #SERVER_PORT: 9200
  # LOGGING_VERBOSE: "true"
  # SERVER_DEFAULTROUTE: "/app/kibana"

service:
  type: NodePort
  externalPort: 443
  internalPort: 5601
  ## External IP addresses of service
  ## Default: nil
  ##
  # externalIPs:
  # - 192.168.0.1
Start deploying kibana:
[root@k8s-master1 kibana]# helm install --name=kib1 --namespace=efk -f values.yaml ./
==> v1/Service
NAME         TYPE      CLUSTER-IP    EXTERNAL-IP  PORT(S)        AGE
kib1-kibana  NodePort  10.108.188.4  <none>       443:31865/TCP  0s
[root@k8s-master1 kibana]# kubectl get svc -n efk
NAME                          TYPE       CLUSTER-IP      EXTERNAL-IP  PORT(S)        AGE
els1-elasticsearch-client     ClusterIP  10.103.105.170  <none>       9200/TCP       2d22h
els1-elasticsearch-discovery  ClusterIP  None            <none>       9300/TCP       2d22h
kib1-kibana                   NodePort   10.108.188.4    <none>       443:31865/TCP  4m27s
[root@k8s-master1 kibana]# kubectl get pods -n efk
NAME                                       READY  STATUS   RESTARTS  AGE
els1-elasticsearch-client-b898c9d47-5gwzq  1/1    Running  0         2d22h
els1-elasticsearch-client-b898c9d47-shqd2  1/1    Running  0         2d22h
els1-elasticsearch-data-0                  1/1    Running  0         22h
els1-elasticsearch-data-1                  1/1    Running  0         22h
els1-elasticsearch-master-0                1/1    Running  0         2d22h
els1-elasticsearch-master-1                1/1    Running  0         2d19h
els1-elasticsearch-master-2                1/1    Running  0         2d19h
fluentd1-fluentd-elasticsearch-9k456       1/1    Running  0         22h
fluentd1-fluentd-elasticsearch-dcnsc       1/1    Running  0         22h
fluentd1-fluentd-elasticsearch-p5h88       1/1    Running  0         22h
fluentd1-fluentd-elasticsearch-sdvn9       1/1    Running  0         22h
fluentd1-fluentd-elasticsearch-ztm9s       1/1    Running  0         22h
kib1-kibana-68f9fbfd84-pt2dt               0/1    Running  0         9m59s
# If this image will not download, waiting a few days may be enough for it to come down; for me it downloaded right away.
Then open a browser and go to <host IP>:<nodeport>
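The node port is the second number in the PORT(S) column of the service listing (443:31865/TCP above). A small sketch for pulling it out of `kubectl get svc` output, assuming the default column layout shown in this article; `node_port_of` is a hypothetical helper, and the sample rows are copied from the listing above.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get svc` output on stdin and print the
# node port of the named service, taken from the PORT(S) column, whose
# format is port:nodePort/protocol (for example 443:31865/TCP).
node_port_of() {
  awk -v svc="$1" '$1 == svc { split($5, p, "[:/]"); print p[2] }'
}

sample_svc() {
  cat <<'EOF'
NAME                          TYPE       CLUSTER-IP      EXTERNAL-IP  PORT(S)        AGE
els1-elasticsearch-client     ClusterIP  10.103.105.170  <none>       9200/TCP       2d22h
kib1-kibana                   NodePort   10.108.188.4    <none>       443:31865/TCP  4m27s
EOF
}

sample_svc | node_port_of kib1-kibana
```

On the master this would be fed from the real command, for example `kubectl get svc -n efk | node_port_of kib1-kibana`, and the printed port is the one to append to the host IP in the browser.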
The page I opened showed an error, though, and the following operation fixes it. (I had the error at the time as well. Sometimes the index could be deleted and sometimes it could not, so I left it alone, and in the end the login worked normally.)
[root@k8s-master1 ~]# kubectl get pods -n efk |grep ela
els1-elasticsearch-client-b898c9d47-8pntr  1/1  Running  1  43h
els1-elasticsearch-client-b898c9d47-shqd2  1/1  Running  1  5d13h
els1-elasticsearch-data-0                  1/1  Running  0  117m
els1-elasticsearch-data-1                  1/1  Running  0  109m
els1-elasticsearch-master-0                1/1  Running  1  2d11h
els1-elasticsearch-master-1                1/1  Running  0  14h
els1-elasticsearch-master-2                1/1  Running  0  14h
[root@k8s-master1 ~]# kubectl exec -it els1-elasticsearch-client-b898c9d47-shqd2 -n efk -- /bin/bash
Deleting the .kibana index in elasticsearch is enough:
[root@els1-elasticsearch-client-b898c9d47-shqd2 elasticsearch]# curl -XDELETE http://els1-elasticsearch-client.efk.svc:9200/.kibana
In the end, we have a working EFK log collection system.