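Jaeger: open-source, end-to-end distributed tracing for monitoring and troubleshooting transactions in complex distributed systems.
The figure below compares commonly used open-source full-link tracing solutions. SkyWalking and Pinpoint are currently the most widely used, but Jaeger's clients support more languages, C++ in particular, so I chose to try it out this time.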
Agent
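5775 UDP: accepts zipkin.thrift over the compact Thrift protocol (Zipkin-compatible data)
6831 UDP: accepts jaeger.thrift over the compact Thrift protocol
6832 UDP: accepts jaeger.thrift over the binary Thrift protocol
5778 HTTP: serves configuration (for example sampling strategies) to clients; it is not meant for large volumes of data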
Collector
14267 TCP: the agent sends data in jaeger.thrift format
14250 TCP: the agent sends data in proto format (gRPC underneath)
14268 HTTP: accepts data directly from clients
14269 HTTP: health check
Query
16686 HTTP: the Jaeger front end, the interface exposed to users
16687 HTTP: health check
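The health check ports listed above can be probed with a small helper. This is a minimal sketch, not part of the original setup: the function names `admin_port` and `check_health` are hypothetical, the agent's admin port 14271 is taken from the agent log shown later in this post, and the `/` route is the health check route those logs report.

```shell
#!/usr/bin/env bash
# Map a Jaeger component name to its admin (health check) port.
# collector and query come from the port list above; agent (14271) is
# taken from the agent's startup log and may differ in other versions.
admin_port() {
  case "$1" in
    agent)     echo 14271 ;;
    collector) echo 14269 ;;
    query)     echo 16687 ;;
    *)         echo "unknown component: $1" >&2; return 1 ;;
  esac
}

# Print the HTTP status code returned by a component's health endpoint.
# $1 = host (hypothetical, e.g. a pod IP), $2 = component name.
check_health() {
  local host="$1" component="$2" port
  port="$(admin_port "$component")" || return 1
  curl -s -o /dev/null -w '%{http_code}\n' "http://${host}:${port}/"
}
```

For example, `check_health 172.20.0.88 collector` would query the collector's health endpoint on port 14269, assuming that address is reachable from where the command runs.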
1. Create the namespace
[root@VM-0-123-centos jaeger]# kubectl create namespace jaeger
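2. Deploy the Jaeger Operator
Jaeger Operator: the Jaeger Operator for Kubernetes simplifies deploying and running Jaeger on Kubernetes.
The Jaeger Operator is an implementation of a Kubernetes operator. An operator is a piece of software that eases the operational complexity of running another piece of software. More technically, operators are a method of packaging, deploying, and managing a Kubernetes application.
Each Jaeger Operator version tracks one version of the Jaeger components (Query, Collector, Agent). When a new version of the Jaeger components is released, a new version of the operator is released that knows how to upgrade running instances of the previous version to the new one.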
[root@VM-0-123-centos jaeger]# kubectl create -n jaeger -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/crds/jaegertracing.io_jaegers_crd.yaml
[root@VM-0-123-centos jaeger]# kubectl create -n jaeger -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/service_account.yaml
[root@VM-0-123-centos jaeger]# kubectl create -n jaeger -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/role.yaml
[root@VM-0-123-centos jaeger]# kubectl create -n jaeger -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/role_binding.yaml
[root@VM-0-123-centos jaeger]# kubectl create -n jaeger -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/operator.yaml
Check the status
[root@VM-0-123-centos jaeger]# kubectl get all -n jaeger
NAME                                         READY   STATUS        RESTARTS   AGE
pod/jaeger-operator-6ff67bdd4b-4nffk         1/1     Running       0          14d
pod/simple-prod-collector-59fc47bf5c-h26mq   0/1     Terminating   0          9d

NAME                              TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
service/jaeger-operator-metrics   ClusterIP   172.20.253.138   <none>        8383/TCP,8686/TCP   14d

NAME                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/jaeger-operator   1/1     1            1           14d

NAME                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/jaeger-operator-6ff67bdd4b   1         1         1       14d
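3. Create a Jaeger instance
Create a jaeger.yaml file that configures the ES cluster and limits the CPU and memory of the containers in Deployment/simple-prod-collector. Up to 10 collector pods can be started.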
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: simple-prod
spec:
  strategy: production
  storage:
    type: elasticsearch
    options:
      es:
        server-urls: http://10.0.16.3:9200
        index-prefix: zhjt
  collector:
    maxReplicas: 10
    resources:
      limits:
        cpu: 500m
        memory: 512Mi
[root@VM-0-123-centos jaeger]# kubectl apply -f jaeger.yaml -n jaeger
jaeger.jaegertracing.io/simple-prod created
List the Jaeger objects
Note: with the official all-in-one example the status appears to be a normal Running. Here the status shows Failed, but this does not affect usage.
[root@VM-0-123-centos jaeger]# kubectl get jaegers -n jaeger
NAME          STATUS   VERSION   STRATEGY     STORAGE         AGE
simple-prod   Failed   1.22.0    production   elasticsearch   9d
Get the pod names
[root@VM-0-123-centos jaeger]# kubectl get pods -l app.kubernetes.io/instance=simple-prod -n jaeger
NAME                                     READY   STATUS    RESTARTS   AGE
simple-prod-collector-59fc47bf5c-h26mq   1/1     Running   0          9d
simple-prod-query-85689b7bbd-g5jw9       2/2     Running   0          9d
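Instead of re-running `kubectl get pods` until everything shows Running, one option is to block until the pods report Ready. The following is a sketch under assumptions: the wrapper name `wait_for_jaeger` and the 120s default timeout are hypothetical choices, while the label selector is the same one used in the command above.

```shell
#!/usr/bin/env bash
# Block until every pod of a Jaeger instance reports the Ready condition.
# $1 = instance name, $2 = namespace, $3 = timeout (optional, assumed default 120s).
wait_for_jaeger() {
  local instance="$1" namespace="$2" timeout="${3:-120s}"
  kubectl wait --for=condition=Ready pod \
    -l "app.kubernetes.io/instance=${instance}" \
    -n "${namespace}" --timeout="${timeout}"
}
```

With the instance from this post, the call would be `wait_for_jaeger simple-prod jaeger`; it returns a non-zero status if the timeout expires first.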
Get the pod logs
[root@VM-0-123-centos jaeger]# kubectl logs simple-prod-query-85689b7bbd-g5jw9 jaeger-agent -n jaeger
2021/04/28 04:55:34 maxprocs: Leaving GOMAXPROCS=4: CPU quota undefined
{"level":"info","ts":1619585734.2081811,"caller":"flags/service.go:117","msg":"Mounting metrics handler on admin server","route":"/metrics"}
{"level":"info","ts":1619585734.2082183,"caller":"flags/service.go:123","msg":"Mounting expvar handler on admin server","route":"/debug/vars"}
{"level":"info","ts":1619585734.2083232,"caller":"flags/admin.go:105","msg":"Mounting health check on admin server","route":"/"}
{"level":"info","ts":1619585734.2083883,"caller":"flags/admin.go:111","msg":"Starting admin HTTP server","http-addr":":14271"}
{"level":"info","ts":1619585734.2084124,"caller":"flags/admin.go:97","msg":"Admin server started","http.host-port":"[::]:14271","health-status":"unavailable"}
{"level":"info","ts":1619585734.2089527,"caller":"grpc/builder.go:70","msg":"Agent requested insecure grpc connection to collector(s)"}
{"level":"info","ts":1619585734.2089992,"caller":"grpc@v1.29.1/clientconn.go:243","msg":"parsed scheme: \"dns\"","system":"grpc","grpc_log":true}
{"level":"info","ts":1619585734.21038,"caller":"command-line-arguments/main.go:84","msg":"Starting agent"}
{"level":"info","ts":1619585734.2104166,"caller":"healthcheck/handler.go:128","msg":"Health Check state change","status":"ready"}
{"level":"info","ts":1619585734.2108943,"caller":"grpc/builder.go:108","msg":"Checking connection to collector"}
{"level":"info","ts":1619585734.210908,"caller":"grpc/builder.go:119","msg":"Agent collector connection state change","dialTarget":"dns:///simple-prod-collector-headless.jaeger.svc:14250","status":"IDLE"}
{"level":"info","ts":1619585734.211061,"caller":"app/agent.go:69","msg":"Starting jaeger-agent HTTP server","http-port":5778}
{"level":"info","ts":1619585734.3344934,"caller":"grpc@v1.29.1/resolver_conn_wrapper.go:143","msg":"ccResolverWrapper: sending update to cc: {[{172.20.0.88:14250 <nil> 0 <nil>}] <nil> <nil>}","system":"grpc","grpc_log":true}
{"level":"info","ts":1619585734.3345578,"caller":"grpc@v1.29.1/clientconn.go:667","msg":"ClientConn switching balancer to \"round_robin\"","system":"grpc","grpc_log":true}
{"level":"info","ts":1619585734.3345697,"caller":"grpc@v1.29.1/clientconn.go:682","msg":"Channel switches to new LB policy \"round_robin\"","system":"grpc","grpc_log":true}
{"level":"info","ts":1619585734.3346283,"caller":"grpc@v1.29.1/clientconn.go:1056","msg":"Subchannel Connectivity change to CONNECTING","system":"grpc","grpc_log":true}
{"level":"info","ts":1619585734.33467,"caller":"grpc@v1.29.1/clientconn.go:1193","msg":"Subchannel picks a new address \"172.20.0.88:14250\" to connect","system":"grpc","grpc_log":true}
{"level":"info","ts":1619585734.334736,"caller":"grpc@v1.29.1/clientconn.go:417","msg":"Channel Connectivity change to CONNECTING","system":"grpc","grpc_log":true}
{"level":"info","ts":1619585734.3347983,"caller":"grpc/builder.go:119","msg":"Agent collector connection state change","dialTarget":"dns:///simple-prod-collector-headless.jaeger.svc:14250","status":"CONNECTING"}
{"level":"info","ts":1619585734.335669,"caller":"grpc@v1.29.1/clientconn.go:1056","msg":"Subchannel Connectivity change to READY","system":"grpc","grpc_log":true}
{"level":"info","ts":1619585734.3357751,"caller":"base/balancer.go:200","msg":"roundrobinPicker: newPicker called with info: {map[0xc0002f5ea0:{{172.20.0.88:14250 <nil> 0 <nil>}}]}","system":"grpc","grpc_log":true}
{"level":"info","ts":1619585734.3357947,"caller":"grpc@v1.29.1/clientconn.go:417","msg":"Channel Connectivity change to READY","system":"grpc","grpc_log":true}
{"level":"info","ts":1619585734.335807,"caller":"grpc/builder.go:119","msg":"Agent collector connection state change","dialTarget":"dns:///simple-prod-collector-headless.jaeger.svc:14250","status":"READY"}
{"level":"info","ts":1619592172.4516647,"caller":"grpc@v1.29.1/clientconn.go:1056","msg":"Subchannel Connectivity change to CONNECTING","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.4517512,"caller":"grpc@v1.29.1/clientconn.go:1193","msg":"Subchannel picks a new address \"172.20.0.88:14250\" to connect","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.4517596,"caller":"base/balancer.go:200","msg":"roundrobinPicker: newPicker called with info: {map[]}","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.4517772,"caller":"grpc@v1.29.1/clientconn.go:417","msg":"Channel Connectivity change to CONNECTING","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.4517884,"caller":"grpc/builder.go:119","msg":"Agent collector connection state change","dialTarget":"dns:///simple-prod-collector-headless.jaeger.svc:14250","status":"CONNECTING"}
{"level":"warn","ts":1619592172.4523218,"caller":"grpc@v1.29.1/clientconn.go:1275","msg":"grpc: addrConn.createTransport failed to connect to {172.20.0.88:14250 <nil> 0 <nil>}. Err: connection error: desc = \"transport: Error while dialing dial tcp 172.20.0.88:14250: connect: connection refused\". Reconnecting...","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.4523551,"caller":"grpc@v1.29.1/clientconn.go:1056","msg":"Subchannel Connectivity change to TRANSIENT_FAILURE","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.452386,"caller":"grpc@v1.29.1/clientconn.go:417","msg":"Channel Connectivity change to TRANSIENT_FAILURE","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.4523947,"caller":"grpc/builder.go:119","msg":"Agent collector connection state change","dialTarget":"dns:///simple-prod-collector-headless.jaeger.svc:14250","status":"TRANSIENT_FAILURE"}
{"level":"info","ts":1619592172.6118224,"caller":"grpc@v1.29.1/resolver_conn_wrapper.go:143","msg":"ccResolverWrapper: sending update to cc: {[{172.20.0.178:14250 <nil> 0 <nil>}] <nil> <nil>}","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.6118581,"caller":"grpc@v1.29.1/clientconn.go:1056","msg":"Subchannel Connectivity change to CONNECTING","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.6118758,"caller":"grpc@v1.29.1/clientconn.go:1056","msg":"Subchannel Connectivity change to SHUTDOWN","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.611892,"caller":"grpc@v1.29.1/clientconn.go:417","msg":"Channel Connectivity change to CONNECTING","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.6119003,"caller":"grpc/builder.go:119","msg":"Agent collector connection state change","dialTarget":"dns:///simple-prod-collector-headless.jaeger.svc:14250","status":"CONNECTING"}
{"level":"info","ts":1619592172.6119049,"caller":"grpc@v1.29.1/clientconn.go:1193","msg":"Subchannel picks a new address \"172.20.0.178:14250\" to connect","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.612726,"caller":"grpc@v1.29.1/clientconn.go:1056","msg":"Subchannel Connectivity change to READY","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.6127572,"caller":"base/balancer.go:200","msg":"roundrobinPicker: newPicker called with info: {map[0xc0003df970:{{172.20.0.178:14250 <nil> 0 <nil>}}]}","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.6127682,"caller":"grpc@v1.29.1/clientconn.go:417","msg":"Channel Connectivity change to READY","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.6127849,"caller":"grpc/builder.go:119","msg":"Agent collector connection state change","dialTarget":"dns:///simple-prod-collector-headless.jaeger.svc:14250","status":"READY"}
[root@VM-0-123-centos jaeger]# kubectl logs simple-prod-query-85689b7bbd-g5jw9 jaeger-query -n jaeger
2021/04/28 04:55:29 maxprocs: Leaving GOMAXPROCS=4: CPU quota undefined
{"level":"info","ts":1619585729.8951077,"caller":"flags/service.go:117","msg":"Mounting metrics handler on admin server","route":"/metrics"}
{"level":"info","ts":1619585729.8951416,"caller":"flags/service.go:123","msg":"Mounting expvar handler on admin server","route":"/debug/vars"}
{"level":"info","ts":1619585729.8952546,"caller":"flags/admin.go:105","msg":"Mounting health check on admin server","route":"/"}
{"level":"info","ts":1619585729.8953054,"caller":"flags/admin.go:111","msg":"Starting admin HTTP server","http-addr":":16687"}
{"level":"info","ts":1619585729.8953238,"caller":"flags/admin.go:97","msg":"Admin server started","http.host-port":"[::]:16687","health-status":"unavailable"}
{"level":"info","ts":1619585729.9169888,"caller":"config/config.go:183","msg":"Elasticsearch detected","version":7}
{"level":"info","ts":1619585729.9174955,"caller":"app/static_handler.go:181","msg":"UI config path not provided, config file will not be watched"}
{"level":"info","ts":1619585729.9175768,"caller":"app/server.go:170","msg":"Query server started"}
{"level":"info","ts":1619585729.9175944,"caller":"healthcheck/handler.go:128","msg":"Health Check state change","status":"ready"}
{"level":"info","ts":1619585729.9176183,"caller":"app/server.go:249","msg":"Starting GRPC server","port":16685,"addr":":16685"}
{"level":"info","ts":1619585729.9176335,"caller":"app/server.go:230","msg":"Starting HTTP server","port":16686,"addr":":16686"}
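Most of the log output above is info-level gRPC chatter; the one entry worth attention is the warn-level "connection refused" line in the agent log. A small filter makes such entries easier to spot. This is only a sketch: the function name `jaeger_log_problems` is hypothetical, and it assumes the logs keep the one-JSON-object-per-line format with a `"level"` field shown above.

```shell
#!/usr/bin/env bash
# Read Jaeger JSON log lines on stdin and keep only warn, error and fatal entries.
# Plain grep is used so that no JSON tooling needs to be installed.
jaeger_log_problems() {
  grep -E '"level":"(warn|error|fatal)"'
}
```

It can be chained after the commands above, for example `kubectl logs simple-prod-query-85689b7bbd-g5jw9 jaeger-agent -n jaeger | jaeger_log_problems`. Note that grep exits with status 1 when nothing matches, which here simply means no problem entries were found.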
4. View the Jaeger resources
[root@VM-0-123-centos jaeger]# kubectl get all -n jaeger
NAME                                         READY   STATUS    RESTARTS   AGE
pod/jaeger-operator-6ff67bdd4b-4nffk         1/1     Running   0          14d
pod/simple-prod-collector-59fc47bf5c-h26mq   1/1     Running   0          8d
pod/simple-prod-query-85689b7bbd-g5jw9       2/2     Running   0          8d

NAME                                     TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                                  AGE
service/jaeger-operator-metrics          ClusterIP   172.20.253.138   <none>        8383/TCP,8686/TCP                        14d
service/simple-prod-collector            ClusterIP   172.20.255.184   <none>        9411/TCP,14250/TCP,14267/TCP,14268/TCP   8d
service/simple-prod-collector-headless   ClusterIP   None             <none>        9411/TCP,14250/TCP,14267/TCP,14268/TCP   8d
service/simple-prod-query                ClusterIP   172.20.254.102   <none>        16686/TCP                                8d

NAME                                    READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/jaeger-operator         1/1     1            1           14d
deployment.apps/simple-prod-collector   1/1     1            1           8d
deployment.apps/simple-prod-query       1/1     1            1           8d

NAME                                               DESIRED   CURRENT   READY   AGE
replicaset.apps/jaeger-operator-6ff67bdd4b         1         1         1       14d
replicaset.apps/simple-prod-collector-59fc47bf5c   1         1         1       8d
replicaset.apps/simple-prod-query-85689b7bbd       1         1         1       8d

NAME                                                        REFERENCE                          TARGETS              MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/simple-prod-collector   Deployment/simple-prod-collector   1457m/90, 137m/90    1         10        1          8d
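The query service above is of type ClusterIP, so the UI on port 16686 is not reachable from outside the cluster as deployed. For a quick look, one option is a local port-forward. This is a sketch under assumptions: the wrapper name `forward_jaeger_ui` is hypothetical, and it relies on the service being named `<instance>-query`, as it is in the listing above.

```shell
#!/usr/bin/env bash
# Forward a local port to the Jaeger query service so the UI can be opened
# in a local browser. Runs in the foreground until interrupted.
# $1 = instance name, $2 = namespace, $3 = local port (optional, default 16686).
forward_jaeger_ui() {
  local instance="$1" namespace="$2" local_port="${3:-16686}"
  kubectl port-forward "svc/${instance}-query" "${local_port}:16686" -n "${namespace}"
}
```

After `forward_jaeger_ui simple-prod jaeger`, the UI should be available at http://localhost:16686 for as long as the command keeps running. For permanent access an Ingress or a different service type would be the more usual choice.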
If traffic is heavy and the load on ES needs to be reduced, a Kafka cluster can be placed in front of it. Modify jaeger.yaml as follows:
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: simple-streaming
spec:
  strategy: streaming
  collector:
    options:
      kafka:
        producer:
          topic: jaeger-spans
          brokers: my-cluster-kafka-brokers.kafka:9092 # change to your Kafka address
  ingester:
    options:
      kafka:
        consumer:
          topic: jaeger-spans
          brokers: my-cluster-kafka-brokers.kafka:9092 # change to your Kafka address
      ingester:
        deadlockInterval: 5s
  storage:
    type: elasticsearch
    options:
      es:
        server-urls: http://elasticsearch:9200 # change to your ES address
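5. Deploy the agent
The agent is a proxy for the Jaeger client: the client sends the trace data it collects to the agent, which then forwards it to the collector. Because the client talks to the agent over UDP, the agent is usually deployed close to the client.
The agent can be installed in several ways.
1) Docker installation
Download: jaegertracing/jaeger-agent Tags (docker.com)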
docker run -d -p 6831:6831/udp -p 6832:6832/udp -p 5778:5778/tcp jaegertracing/jaeger-agent:1.12 --reporter.grpc.host-port=xx.xx.xx.xx:14250
2) Kubernetes installation, which comes in two variants
sidecar
daemonset
Reference: Operator for Kubernetes — Jaeger documentation (jaegertracing.io)
3) Binary installation
Download: Jaeger – Download Jaeger (jaegertracing.io)
nohup ./jaeger-agent --collector.host-port=xxxx:14267 1>1.log 2>2.log &