Spring Cloud Sleuth + Zipkin + ELK Service Call Tracing (Part 7)

Date: 2019-11-06


Preface

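Sleuth is Spring Cloud's distributed tracing tool. Its main job is to record call-chain data, and on its own it only supports in-memory storage. To improve performance under heavy load, it can ship the data over HTTP, or use RabbitMQ or Kafka as the transport instead.

Zipkin is the distributed tracing system open-sourced by Twitter. It can receive data, store it (in memory, Cassandra, MySQL, or Elasticsearch), query it, and display it. Zipkin does not itself trace anything inside your distributed services; Sleuth is a convenient way to collect the data and send it over.

Put that way, the division of labor should be clear.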

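Why service tracing matters

Today's popular architectures are all built on microservices, which inevitably means more and more services that depend on and call each other. Suppose the call relationships look like the figure below.

As the number of services keeps growing, the call graph ends up like this: a tangled mess. Without a way to record and query the call chain between services, you would not even know which code to start reading when you need to locate a problem quickly, and pinning down who is responsible would probably turn into a shouting match, haha.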

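Industry solutions

Google built the Dapper tracing component and in 2010 published the paper "Dapper, a Large-Scale Distributed Systems Tracing Infrastructure". That paper is the benchmark and theoretical foundation for tracing implementations in the industry and is an extremely valuable reference.

The following tracing products are all excellent and well worth studying: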

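Google's Dapper
Twitter's Zipkin
Alibaba's EagleEye
Meituan-Dianping's CAT
Sina's Watchman
JD's Hydra
SkyWalking, open-sourced by individual developer Wu Sheng (a Huawei developer) (excellent)
Pinpoint, open-sourced by the Naver team in Korea

Have a look at them when you have time.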

Sleuth tracing terminology

Spring Cloud Sleuth borrows its terminology from Google's Dapper.

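Span: the basic unit of work. For example, sending an RPC is a new span, and so is sending the response to an RPC. A span is uniquely identified by a 64-bit ID, and the trace it belongs to is identified by another 64-bit ID. A span also carries other data, such as a description, timestamped events, key-value annotations (tags), the ID of the span that caused it, and a process ID (usually an IP address). Spans are started and stopped, and they record timing information along the way. Once you create a span, you must stop it at some point in the future.
Trace: a set of spans forming a tree structure. For example, if you run a distributed big-data store, a single request to it could form one trace.
Annotation: records the occurrence of an event in time. A few core annotations define the start and end of a request:
- cs - Client Sent - the client has issued a request. This annotation marks the start of the span.
- sr - Server Received - the server has received the request and is about to process it. Subtracting the cs timestamp from sr gives the network latency.
- ss - Server Sent - request processing is complete (the response is being returned to the client). Subtracting the sr timestamp from ss gives the time the server needed to process the request.
- cr - Client Received - marks the end of the span: the client has successfully received the server's response. Subtracting the cs timestamp from cr gives the total time the client waited for the response.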
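The three subtractions above can be sketched in a few lines of Java. This is only an illustration: the class name AnnotationTimings and the microsecond unit are my assumptions, not part of Sleuth or Zipkin, and the network latency figure is only meaningful if client and server clocks are reasonably in sync.

```java
// Hypothetical helper, not a Sleuth or Zipkin class: derives the three
// durations described above from the four core annotation timestamps.
final class AnnotationTimings {
    final long cs; // Client Sent
    final long sr; // Server Received
    final long ss; // Server Sent
    final long cr; // Client Received

    AnnotationTimings(long cs, long sr, long ss, long cr) {
        this.cs = cs;
        this.sr = sr;
        this.ss = ss;
        this.cr = cr;
    }

    /** sr - cs: time the request spent travelling to the server. */
    long networkLatency() {
        return sr - cs;
    }

    /** ss - sr: time the server spent processing the request. */
    long serverProcessing() {
        return ss - sr;
    }

    /** cr - cs: total time as seen by the client. */
    long clientTotal() {
        return cr - cs;
    }

    public static void main(String[] args) {
        // Timestamps in microseconds (an assumption for this example).
        AnnotationTimings t = new AnnotationTimings(1_000, 1_040, 1_240, 1_290);
        System.out.println("network latency   = " + t.networkLatency());
        System.out.println("server processing = " + t.serverProcessing());
        System.out.println("client total      = " + t.clientTotal());
    }
}
```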

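Here is what spans and traces look like in a system, drawn with the Zipkin annotations:

The trace ID is unique and stays the same across the whole call chain, which also makes lookups easy.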
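To make the "one trace ID for the whole chain" idea concrete, here is a minimal sketch. SketchSpan is a made-up type for illustration (Sleuth's real span carries more fields), and the random ID generation is my assumption. The service names come from the sample project later in this post.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical minimal span model, not Sleuth's actual Span class.
final class SketchSpan {
    final long traceId;  // shared by every span in the same trace
    final long spanId;   // unique to this span
    final Long parentId; // null for the root span
    final String name;

    private SketchSpan(long traceId, long spanId, Long parentId, String name) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.parentId = parentId;
        this.name = name;
    }

    /** Starts a new trace: fresh trace ID, fresh span ID, no parent. */
    static SketchSpan newRoot(String name) {
        ThreadLocalRandom rnd = ThreadLocalRandom.current();
        return new SketchSpan(rnd.nextLong(), rnd.nextLong(), null, name);
    }

    /** Continues the trace: same trace ID, new span ID, this span as parent. */
    SketchSpan newChild(String childName) {
        long childId = ThreadLocalRandom.current().nextLong();
        return new SketchSpan(traceId, childId, spanId, childName);
    }

    public static void main(String[] args) {
        SketchSpan shop = SketchSpan.newRoot("shop");
        SketchSpan order = shop.newChild("order");
        SketchSpan promotion = order.newChild("promotion");
        // All three lines print the same trace ID.
        for (SketchSpan s : new SketchSpan[] {shop, order, promotion}) {
            System.out.println(s.name + " traceId=" + Long.toHexString(s.traceId)
                    + " spanId=" + Long.toHexString(s.spanId));
        }
    }
}
```

Searching by trace ID then returns every span in the chain, and the parent IDs let a UI rebuild the tree.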

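Introducing Zipkin

Zipkin has four main components: collector, storage, API, and web UI. The collector gathers the data that services send to Zipkin. Storage persists that trace data and currently supports Cassandra, Elasticsearch (recommended, since it scales out easily), and MySQL. The API looks up and retrieves traces and serves them to the UI for display.

How tracing works: a tracer lives inside your application and records the timing and metadata of the operations that take place. The trace data it collects is called a span. The component inside the instrumented application that sends data to Zipkin is called a reporter. The reporter sends the trace data to the Zipkin collector over one of several transports (HTTP, Kafka), the collector hands it to storage, and the API queries storage to provide data to the UI.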
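The reporter, collector, storage, and API flow can be modelled in memory like this. To be clear, none of these classes are Zipkin's real types; the names, fields, and the buffering behavior of the reporter are all assumptions I made to keep the sketch small.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical in-memory model of the Zipkin data flow, for illustration only.
final class ZipkinFlowSketch {

    /** A recorded unit of work, reduced to the fields this sketch needs. */
    static final class SpanRecord {
        final String traceId;
        final String name;

        SpanRecord(String traceId, String name) {
            this.traceId = traceId;
            this.name = name;
        }
    }

    /** Storage: keeps spans grouped by trace ID. */
    static final class InMemoryStorage {
        private final Map<String, List<SpanRecord>> byTrace = new ConcurrentHashMap<>();

        void save(SpanRecord span) {
            byTrace.computeIfAbsent(span.traceId, k -> new CopyOnWriteArrayList<>()).add(span);
        }

        List<SpanRecord> find(String traceId) {
            return byTrace.getOrDefault(traceId, new ArrayList<>());
        }
    }

    /** Collector: receives spans from reporters and passes them to storage. */
    static final class Collector {
        private final InMemoryStorage storage;

        Collector(InMemoryStorage storage) {
            this.storage = storage;
        }

        void accept(List<SpanRecord> spans) {
            spans.forEach(storage::save);
        }
    }

    /** Reporter: lives in the instrumented app and ships buffered spans. */
    static final class Reporter {
        private final Collector collector;
        private final List<SpanRecord> buffer = new ArrayList<>();

        Reporter(Collector collector) {
            this.collector = collector;
        }

        void report(SpanRecord span) {
            buffer.add(span);
        }

        /** Stands in for the HTTP or Kafka transport. */
        void flush() {
            collector.accept(new ArrayList<>(buffer));
            buffer.clear();
        }
    }

    /** Query API: what the UI would call to fetch one trace. */
    static final class QueryApi {
        private final InMemoryStorage storage;

        QueryApi(InMemoryStorage storage) {
            this.storage = storage;
        }

        List<SpanRecord> getTrace(String traceId) {
            return storage.find(traceId);
        }
    }

    public static void main(String[] args) {
        InMemoryStorage storage = new InMemoryStorage();
        Reporter reporter = new Reporter(new Collector(storage));
        QueryApi api = new QueryApi(storage);

        reporter.report(new SpanRecord("t1", "shop"));
        reporter.report(new SpanRecord("t1", "order"));
        reporter.flush();

        System.out.println("spans in trace t1: " + api.getTrace("t1").size());
    }
}
```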

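Setting up the project

My sample project is shown above.

1. trade-zipkin-server is the Zipkin server. It displays, searches, and stores the tracing data.

2. shop-->order-->shouhou & promotion (a simple call chain; these are the business projects whose calls we actually want to trace).

Zipkin server configuration code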

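import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
// Package as of zipkin-server 2.11.x (an assumption; check your version).
import zipkin.server.internal.EnableZipkinServer;

@SpringBootApplication
@EnableZipkinServer
public class StartMain {
    public static void main(String[] args) {
        SpringApplication.run(StartMain.class, args);
    }
}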

<dependency>
    <groupId>io.zipkin.java</groupId>
    <artifactId>zipkin-server</artifactId>
    <version>2.11.8</version>
</dependency>
<dependency>
    <groupId>io.zipkin.java</groupId>
    <artifactId>zipkin-autoconfigure-ui</artifactId>
    <version>2.11.8</version>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-zipkin</artifactId>
</dependency>

Business project configuration

spring.sleuth.enabled=true
spring.sleuth.sampler.percentage=1
spring.zipkin.base-url=http://localhost:8087

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-sleuth</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-sleuth-zipkin</artifactId>
</dependency>

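Note:

spring.sleuth.sampler.percentage sets the sampling rate (0.1 if not configured). Raising it to 1 means every request is sampled, so trace data shows up more promptly. After making that change, though, we found our REST endpoints were noticeably slower than at 0.1. Even at the 0.1 sampling rate, refreshing the consumer endpoint several times showed very different timings between two calls of the same request. When we removed spring-cloud-sleuth and tested again, that behavior disappeared. So tracing service calls this way does have some impact on the performance of the business application.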
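To show what a sampling rate of 0.1 versus 1 means in practice, here is a small counter-based sampler. This is my own sketch and not Sleuth's actual sampler implementation; the class name RateSamplerSketch and the fixed window of 100 requests are assumptions.

```java
// Hypothetical sampler sketch, not Sleuth's implementation: within every
// window of 100 requests, the first N are sampled, where N = percentage * 100.
final class RateSamplerSketch {
    private final int sampledPer100;
    private long seen = 0;

    RateSamplerSketch(double percentage) {
        if (percentage < 0.0 || percentage > 1.0) {
            throw new IllegalArgumentException("percentage must be between 0.0 and 1.0");
        }
        this.sampledPer100 = (int) Math.round(percentage * 100);
    }

    /** Returns true if the current request should be reported to Zipkin. */
    synchronized boolean isSampled() {
        long slot = seen++ % 100;
        return slot < sampledPer100;
    }

    public static void main(String[] args) {
        RateSamplerSketch tenPercent = new RateSamplerSketch(0.1);
        int sampled = 0;
        for (int i = 0; i < 1000; i++) {
            if (tenPercent.isSampled()) {
                sampled++;
            }
        }
        System.out.println("sampled " + sampled + " of 1000 requests");
    }
}
```

At 1.0 every request is exported, which is ten times the reporting work of 0.1. That fits the slowdown described in the note, although I have not measured where the time actually goes.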

The Zipkin UI showing the collected data looks like this:

Writing Sleuth + Zipkin data to Elasticsearch and displaying it with Kibana

Configuring the Zipkin server

<dependency>
    <groupId>io.zipkin.java</groupId>
    <artifactId>zipkin-autoconfigure-storage-elasticsearch-http</artifactId>
    <version>2.8.4</version>
</dependency>

zipkin.storage.StorageComponent=elasticsearch
zipkin.storage.type=elasticsearch
# Can point at a cluster; I tested locally without deploying an Elasticsearch cluster
zipkin.storage.elasticsearch.hosts=es.me.com
zipkin.storage.elasticsearch.cluster=iron-man
zipkin.storage.elasticsearch.index=trade-zipkin
zipkin.storage.elasticsearch.index-shards=5
zipkin.storage.elasticsearch.index-replicas=1