Spring Boot 2.0.3 + Spring Cloud (Finchley), Part 7: Distributed Tracing with Spring Cloud Sleuth

Date: 2019-11-11


Reference: Spring Cloud (12): Distributed Tracing with Sleuth and Zipkin [Finchley edition]

Spring Cloud Sleuth is a Spring Cloud component whose main purpose is to provide a distributed tracing solution for distributed systems.

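A microservice architecture is a distributed architecture. A microservice system is split into service units along business lines, and a single system often contains many of them. With so many service units and such high business complexity, errors and exceptions are hard to locate. The main reason is that one request may need to call many services, and the complexity of those internal calls makes problems hard to pin down. A microservice architecture therefore needs distributed tracing to record which services take part in a request and in what order, so that every step of each request is clearly visible and problems can be located quickly.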

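Today the theoretical basis for distributed tracing in industry comes mainly from a paper Google published in 2010, "Dapper, a Large-Scale Distributed Systems Tracing Infrastructure". The most widely used open-source implementation is Twitter's Zipkin. To make distributed tracing platform-neutral and vendor-neutral, the CNCF released the distributed tracing standard OpenTracing. In China, Taobao's "EagleEye", JD's "Hydra", Dianping's "CAT", Sina's "Watchman", Vipshop's "Microscope", and Wowo's "Tracing" are all systems of this kind.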

Spring Cloud Sleuth also provides a complete solution. This chapter describes in detail how to use Spring Cloud Sleuth + Zipkin to add distributed tracing to our microservice architecture.

Spring Cloud Sleuth

In general, a distributed tracing system consists of three main parts:

Data collection
Data storage
Data presentation

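The structure of each part varies with the size of the system. In a large-scale distributed system, for example, data storage can be split into real-time data and full data: real-time data is used for troubleshooting, and full data for system optimization. Data collection has to be independent of platform and programming language, must also cover asynchronous collection (messages in queues need to be traced to keep the call chain continuous), and should be as non-intrusive as possible. Data presentation in turn involves data mining and analysis. Each part can become quite complex, but the basic principles are similar.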

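The unit of tracing is the process that starts when a client request arrives at the boundary of the traced system and ends when the system returns the response to the client; this is called a trace. Each trace calls a number of services. To record which services were called and how long each call took, a call record is embedded on every service call; this is called a span. A trace is thus made up of a number of ordered spans. As the system serves the outside world, requests and responses keep occurring and traces keep being generated. Recording these traces together with their spans makes it possible to draw a service topology of the system. With the response times and the success or failure of requests stored in the spans, the failing service can be found when a problem occurs. From historical data, one can also analyze where performance is poor at the level of the whole system and identify targets for optimization.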
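To make the trace/span relationship concrete, here is a minimal sketch of the data model in plain Java. The class and field names (TraceSketch, Span, Trace, slowestChild) are made up for illustration and are not Sleuth's or Zipkin's real types.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative data model only; these are NOT Sleuth's or Zipkin's real classes.
final class TraceSketch {

    /** One recorded call: which service handled it and how long it took. */
    static final class Span {
        final String spanId;
        final String parentId;      // null for the root span of the trace
        final String serviceName;
        final long durationMillis;

        Span(String spanId, String parentId, String serviceName, long durationMillis) {
            this.spanId = spanId;
            this.parentId = parentId;
            this.serviceName = serviceName;
            this.durationMillis = durationMillis;
        }
    }

    /** A trace groups every span produced while serving one external request. */
    static final class Trace {
        final String traceId;
        final List<Span> spans = new ArrayList<>();

        Trace(String traceId) {
            this.traceId = traceId;
        }

        Trace record(Span span) {
            spans.add(span);
            return this;
        }

        /** Returns the slowest non-root span, or null if the trace has none. */
        Span slowestChild() {
            Span slowest = null;
            for (Span s : spans) {
                if (s.parentId == null) {
                    continue; // skip the root span, which covers the whole request
                }
                if (slowest == null || s.durationMillis > slowest.durationMillis) {
                    slowest = s;
                }
            }
            return slowest;
        }
    }

    public static void main(String[] args) {
        Trace trace = new Trace("trace-1")
                .record(new Span("a", null, "gateway-service", 30))
                .record(new Span("b", "a", "user-service", 12));
        System.out.println("slowest downstream call: " + trace.slowestChild().serviceName);
    }
}
```

The parentId field is what turns a flat list of spans into a call tree, which is the information a topology or dependency view is built from.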

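Spring Cloud Sleuth provides tracing for calls between services. With Sleuth it is easy to see which services a request passed through and how long each one took to process it, which makes the call relationships among microservices easy to sort out. In addition, Sleuth helps with:

Latency analysis: Sleuth makes it easy to see how long each sampled request took, and therefore which service calls are slow;
Error visualization: exceptions the program did not catch can be seen in the Zipkin UI once Zipkin is integrated;
Call-chain optimization: services that are called frequently can be identified and optimized.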

Spring Cloud Sleuth can be combined with Zipkin: it sends the trace information to Zipkin, which stores it and displays it in the Zipkin UI.

[Figure: Spring Cloud Sleuth concept diagram]

Zipkin

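Zipkin is an open-source project from Twitter, based on Google Dapper. It collects timing data from services in order to troubleshoot latency problems in microservice architectures, and covers the collection, storage, lookup, and display of that data.
We can use it to collect trace data for request chains across servers, and use its REST API to query that data and build monitoring for a distributed system, so that rising latency is detected promptly and the root cause of performance bottlenecks can be found. Besides the developer-facing API, it also provides a convenient UI for searching traces and inspecting the details of a request chain, for example to look up the processing time of user requests over a given period.
Zipkin offers pluggable storage: In-Memory, MySQL, Cassandra, and Elasticsearch. For convenience the tests below use In-Memory storage; Elasticsearch is recommended for production.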
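As a rough illustration of using the REST API from code, the sketch below queries Zipkin's v2 trace search endpoint (/api/v2/traces) with only the JDK. The class name ZipkinQuerySketch and its helper methods are made up for this example, and the base URL and service name are the ones this article uses later; treat it as a sketch under those assumptions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Zipkin or Sleuth.
final class ZipkinQuerySketch {

    /** Builds the URL that searches recent traces for one service. */
    static String tracesUrl(String baseUrl, String serviceName, int limit) {
        try {
            return baseUrl + "/api/v2/traces?serviceName="
                    + URLEncoder.encode(serviceName, "UTF-8")
                    + "&limit=" + limit;
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    /** Sends a GET request and returns the raw JSON response body. */
    static String fetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("GET");
        conn.setRequestProperty("Accept", "application/json");
        StringBuilder body = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line);
            }
        } finally {
            conn.disconnect();
        }
        return body.toString();
    }

    public static void main(String[] args) throws IOException {
        // Requires a Zipkin server listening on localhost:9411.
        String url = tracesUrl("http://localhost:9411", "user-service", 10);
        System.out.println(fetch(url));
    }
}
```

The response is a JSON array of traces, each of which is itself an array of spans, so a monitoring job could parse it with any JSON library.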

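As its architecture diagram shows, Zipkin consists of four core components:

Collector: processes trace information sent from external systems and converts it into Zipkin's internal Span format to support later storage, analysis, and display.
Storage: handles the trace information received by the collector. By default it is kept in memory; the storage strategy can be changed so that another storage component writes the trace information to a database.
RESTful API: provides the external access interface, for example to show trace information to clients or to let external systems connect for monitoring.
Web UI: an application built on top of the API component, through which users can query and analyze trace information conveniently and intuitively.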

Quick start

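Zipkin has two sides: the Zipkin server and the Zipkin client, where the client is the microservice application.
The client is configured with the server's URL. Whenever a call between services occurs, the Sleuth listener configured in the microservice intercepts it, generates the corresponding Trace and Span information, and sends it to the server.
There are two main ways to send it: over HTTP, or over a message bus such as RabbitMQ.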
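To give a feel for what travels over the HTTP transport, the sketch below builds one span in Zipkin's v2 JSON format, which a reporter would POST (inside a JSON array) to /api/v2/spans. The field names follow the Zipkin v2 span model; the class SpanJsonSketch and the sample IDs are made up, and the code assumes its inputs need no JSON escaping.

```java
// Hypothetical helper; real clients use Zipkin's reporter libraries instead.
final class SpanJsonSketch {

    /**
     * Renders one span as Zipkin v2 JSON. Timestamps and durations are in
     * microseconds. Assumes the string arguments contain no characters that
     * would need JSON escaping.
     */
    static String toJson(String traceId, String spanId, String parentId,
                         String name, String serviceName,
                         long timestampMicros, long durationMicros) {
        StringBuilder json = new StringBuilder("{");
        json.append("\"traceId\":\"").append(traceId).append("\",");
        json.append("\"id\":\"").append(spanId).append("\",");
        if (parentId != null) {
            // Root spans have no parent, so the field is simply omitted.
            json.append("\"parentId\":\"").append(parentId).append("\",");
        }
        json.append("\"name\":\"").append(name).append("\",");
        json.append("\"timestamp\":").append(timestampMicros).append(',');
        json.append("\"duration\":").append(durationMicros).append(',');
        json.append("\"localEndpoint\":{\"serviceName\":\"")
            .append(serviceName).append("\"}");
        return json.append('}').toString();
    }

    public static void main(String[] args) {
        String span = toJson("463ac35c9f6413ad", "a2fb4a1d1a96d312", null,
                "get /user/hi", "user-service",
                System.currentTimeMillis() * 1000, 2500);
        // The request body is a JSON array of spans.
        System.out.println("[" + span + "]");
    }
}
```

With the RabbitMQ transport the payload is the same kind of span list; only the channel changes from an HTTP POST to a message on a queue.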

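Either way, we need:

A Eureka service registry. Here we reuse the earlier eureka-server project.
A Zipkin server.
Two microservice applications. gateway-service is the service gateway project; it forwards requests and also acts as a tracing client, responsible for producing trace data and uploading it to the Zipkin server. user-service is a service provider that exposes an API and also acts as a tracing client, responsible for producing trace data.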

Approach 1: HTTP

Zipkin server

As for the Zipkin server, since the move to Spring Boot 2.x the project no longer recommends building a custom server and instead provides a prebuilt jar. For details see upgrade to Spring Boot 2.0 NoClassDefFoundError UndertowEmbeddedServletContainerFactory · Issue #1962 · openzipkin/zipkin · GitHub

The old @EnableZipkinServer annotation has also been marked @Deprecated:

If you decide to make a custom server, you accept responsibility for troubleshooting your build or configuration problems, even if such problems are a reaction to a change made by the OpenZipkin maintainers. In other words, custom servers are possible, but not supported.
EnableZipkinServer.java - github.com/openzipkin/zipkin/blob/master/zipkin-server/src/main/java/zipkin/server/EnableZipkinServer.java

In short: modify the package on your own and you bear the consequences.

The project therefore provides a one-line script (on Windows you need curl installed; if you have the Git client, you can run it directly in Git Bash):

curl -sSL https://zipkin.io/quickstart.sh | bash -s
java -jar zipkin.jar

With Docker, simply run:

docker run -d -p 9411:9411 openzipkin/zipkin

After starting it either way, open http://localhost:9411/zipkin/ to see the Zipkin UI. It even has a Chinese localization, which looks nice.

The server is now ready.

Microservice applications

Building the User Service

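Create a new module project, user-service, as the service provider that exposes an API. Its pom file inherits from the main Maven project's pom and adds the Eureka client and Zipkin starter dependencies; the Zipkin starter already includes the Sleuth starter.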

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.cralor</groupId>
    <artifactId>user-service</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <packaging>jar</packaging>

    <name>user-service</name>
    <description>Demo project for Spring Boot</description>

    <parent>
        <groupId>com.cralor</groupId>
        <artifactId>chap11-sleuth</artifactId>
        <version>0.0.1-SNAPSHOT</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
        <java.version>1.8</java.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-netflix-eureka-client</artifactId>
        </dependency>
        <!--<dependency>-->
            <!--<groupId>org.springframework.cloud</groupId>-->
            <!--<artifactId>spring-cloud-starter-sleuth</artifactId>-->
        <!--</dependency>-->
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-zipkin</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>

</project>

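The configuration file sets the application name to user-service, the port to 8762, the service registry address to http://localhost:8761/eureka/, and the Zipkin Server address to http://localhost:9411. Spring Cloud Sleuth has a Sampler strategy, and the sampling algorithm is controlled through its implementation class. The sampler does not prevent span-related IDs from being generated, but it does affect exporting and the attaching of event tags. Sleuth's default sampling algorithm is described as Reservoir sampling; the implementation class was PercentageBasedSampler in Sleuth 1.x and is ProbabilityBasedSampler in the Sleuth 2.x line that Finchley uses. The default sampling ratio is 0.1 (10%). It can be changed with spring.sleuth.sampler.probability, which takes a value between 0.0 and 1.0; 1.0 means every request is sampled.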
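To illustrate what a sampling probability means in practice, here is a minimal sketch of a probability-based sampler. It is not Sleuth's implementation: the class name ProbabilitySamplerSketch and the bucket-by-trace-ID approach are assumptions chosen to keep the example short and deterministic.

```java
// Illustration only; Sleuth's own sampler classes work differently internally.
final class ProbabilitySamplerSketch {

    private static final long BUCKETS = 10_000L;

    // Number of buckets (out of BUCKETS) whose traces are kept.
    private final long threshold;

    ProbabilitySamplerSketch(double probability) {
        if (probability < 0.0 || probability > 1.0) {
            throw new IllegalArgumentException("probability must be in [0.0, 1.0]");
        }
        this.threshold = Math.round(probability * BUCKETS);
    }

    /** Maps the trace ID to a bucket in [0, BUCKETS) and keeps the low buckets. */
    boolean isSampled(long traceId) {
        // floorMod never returns a negative value, even for negative trace IDs.
        return Math.floorMod(traceId, BUCKETS) < threshold;
    }

    public static void main(String[] args) {
        ProbabilitySamplerSketch tenPercent = new ProbabilitySamplerSketch(0.1);
        int kept = 0;
        for (long id = 0; id < 100_000; id++) {
            if (tenPercent.isSampled(id)) {
                kept++;
            }
        }
        System.out.println("kept " + kept + " of 100000 traces");
    }
}
```

Deciding from the trace ID means the same trace always gets the same answer. In a real Sleuth setup the decision is made once where the trace starts and then carried along with the request, so downstream services do not need to re-decide.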

server:
  port: 8762
spring:
  application:
    name: user-service
  sleuth:
    sampler:
      probability: 1.0 # sample every request; the default is 0.1
  zipkin:
    base-url: http://localhost:9411 # address of the Zipkin server
eureka:
  client:
    service-url:
      defaultZone: http://localhost:8761/eureka/

In the UserController class, create a "/user/hi" API endpoint that is exposed externally:

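import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/user")
public class UserController {

    // GET /user/hi returns a fixed greeting
    @GetMapping("/hi")
    public String hi(){
        return "i'm cralor";
    }
}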

Building the Gateway Service

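Create a new project, gateway-service, as the service gateway that forwards requests to user-service. As a Zipkin client it uploads trace data to the Zipkin server. Its pom file adds the Eureka client, Zipkin, and Zuul dependencies.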

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-netflix-eureka-client</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-netflix-zuul</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-zipkin</artifactId>
</dependency>

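The configuration file sets the application name to gateway-service, the port to 5000, the service registry address to http://localhost:8761/eureka/, and the Zipkin Server address to http://localhost:9411. Requests whose path matches "/user-api/**" are forwarded to the service named user-service.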

server:
  port: 5000
spring:
  application:
    name: gateway-service
  sleuth:
    sampler:
      probability: 1.0 # sample every request; the default is 0.1
  zipkin:
    base-url: http://localhost:9411 # address of the Zipkin server

eureka:
  client:
    service-url:
      defaultZone: http://localhost:8761/eureka/
zuul:
  routes:
    api-a:
      path: /user-api/**
      serviceId: user-service  # forward requests matching "/user-api/**" to the service named user-service
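The route above can be read as a path rewrite: the "/user-api" prefix selects the target service and is removed before the request is forwarded. The sketch below mimics that mapping in plain Java, assuming Zuul's default prefix stripping for route paths; RouteSketch and forwardPath are made-up names, not Zuul APIs.

```java
// Hypothetical illustration of prefix routing; this is not Zuul code.
final class RouteSketch {

    /**
     * Returns the path that would be forwarded to the target service,
     * or null when the request path does not match the route pattern.
     * Only patterns of the form "/prefix/**" are handled here.
     */
    static String forwardPath(String routePattern, String requestPath) {
        if (!routePattern.endsWith("/**")) {
            throw new IllegalArgumentException("only patterns ending in /** are supported");
        }
        String prefix = routePattern.substring(0, routePattern.length() - 3);
        if (requestPath.equals(prefix)) {
            return "/";
        }
        if (requestPath.startsWith(prefix + "/")) {
            // Drop the route prefix and keep the rest of the path.
            return requestPath.substring(prefix.length());
        }
        return null;
    }

    public static void main(String[] args) {
        // The gateway receives /user-api/user/hi and calls user-service at /user/hi.
        System.out.println(forwardPath("/user-api/**", "/user-api/user/hi"));
    }
}
```

This matches the URL used later in the article, where a request to /user-api/user/hi on the gateway reaches the /user/hi endpoint of user-service.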

Add the @EnableZuulProxy annotation to the application class to enable the Zuul proxy:

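import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.netflix.zuul.EnableZuulProxy;

@EnableZuulProxy
@SpringBootApplication
public class GatewayServiceApplication {

    public static void main(String[] args) {
        SpringApplication.run(GatewayServiceApplication.class, args);
    }
}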
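When the gateway forwards a request to user-service, the trace context has to travel with it so that both services report spans under the same trace. Sleuth does this with B3 HTTP headers (X-B3-TraceId, X-B3-SpanId, X-B3-ParentSpanId, X-B3-Sampled). The sketch below shows the idea in plain Java; the class B3PropagationSketch and its methods are made up, and Sleuth's real instrumentation does this for us automatically.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

// Illustration of B3 header propagation; not Sleuth's actual code.
final class B3PropagationSketch {

    /** Generates a random 64-bit ID rendered as 16 lowercase hex characters. */
    static String newId() {
        return String.format("%016x", ThreadLocalRandom.current().nextLong());
    }

    /** Derives the headers for an outgoing call from the incoming request's headers. */
    static Map<String, String> childHeaders(Map<String, String> incoming) {
        Map<String, String> outgoing = new HashMap<>();
        String traceId = incoming.get("X-B3-TraceId");
        String callerSpanId = incoming.get("X-B3-SpanId");
        if (traceId == null) {
            // No trace context arrived, so this request starts a new trace.
            traceId = newId();
        } else if (callerSpanId != null) {
            // The caller's span becomes the parent of the new span.
            outgoing.put("X-B3-ParentSpanId", callerSpanId);
        }
        outgoing.put("X-B3-TraceId", traceId);
        outgoing.put("X-B3-SpanId", newId());
        String sampled = incoming.get("X-B3-Sampled");
        if (sampled != null) {
            // The sampling decision is passed along unchanged.
            outgoing.put("X-B3-Sampled", sampled);
        }
        return outgoing;
    }

    public static void main(String[] args) {
        Map<String, String> incoming = new HashMap<>();
        incoming.put("X-B3-TraceId", "463ac35c9f6413ad");
        incoming.put("X-B3-SpanId", "a2fb4a1d1a96d312");
        incoming.put("X-B3-Sampled", "1");
        System.out.println(childHeaders(incoming));
    }
}
```

The trace ID stays constant across every hop while each hop gets a fresh span ID, which is how Zipkin can later stitch the spans from gateway-service and user-service into one trace.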

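Start the Zipkin server, then start eureka-server, user-service, and gateway-service in that order. Opening http://localhost:5000/user-api/user/hi in a browser displays the response from user-service, i'm cralor.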

Open the Zipkin server at http://localhost:9411/zipkin/ and click Find Traces; one record is listed.

Click the record to open its detail page, which shows the time spent in each service and the order of the calls.

Click the dependency analysis view to see the call relationships between the projects.
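A dependency view like this can be derived from the spans themselves: whenever a span's parent belongs to a different service, that is one call from the parent's service to the child's. The sketch below shows one way to aggregate such links; DependencySketch and its types are made up for illustration and are not how Zipkin is necessarily implemented.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only; Zipkin computes its dependency links on the server side.
final class DependencySketch {

    static final class Span {
        final String id;
        final String parentId;     // null for a root span
        final String serviceName;

        Span(String id, String parentId, String serviceName) {
            this.id = id;
            this.parentId = parentId;
            this.serviceName = serviceName;
        }
    }

    /** Counts cross-service calls, keyed as "caller -> callee". */
    static Map<String, Integer> links(List<Span> spans) {
        Map<String, String> serviceBySpanId = new HashMap<>();
        for (Span span : spans) {
            serviceBySpanId.put(span.id, span.serviceName);
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Span span : spans) {
            if (span.parentId == null) {
                continue; // root spans have no caller inside the system
            }
            String caller = serviceBySpanId.get(span.parentId);
            if (caller == null || caller.equals(span.serviceName)) {
                continue; // unknown parent, or a call within the same service
            }
            counts.merge(caller + " -> " + span.serviceName, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Span> spans = Arrays.asList(
                new Span("a", null, "gateway-service"),
                new Span("b", "a", "user-service"));
        System.out.println(links(spans));
    }
}
```

For the two services in this article the result is a single link from gateway-service to user-service, which is what the dependency page draws as an arrow between the two nodes.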

Approach 2: message bus (RabbitMQ)

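As mentioned above, Zipkin no longer recommends building a custom server, so zipkin-server can no longer be found in the dependency management of recent Spring Cloud versions.
So how does the official jar obtain trace information from RabbitMQ?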

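Zipkin can be told to read from RabbitMQ through its configuration, set either with an environment variable or, as here, with a command-line property: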

java -jar zipkin.jar --zipkin.collector.rabbitmq.addresses=localhost

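Started this way, Zipkin uses RabbitMQ for tracing. The RabbitMQ username and password configured in Zipkin are guest and guest; if your RabbitMQ credentials differ, override that configuration at startup as well.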

The contents of the yml configuration file inside zipkin.jar can be viewed here: https://github.com/openzipkin/zipkin/blob/master/zipkin-server/src/main/resources/zipkin-server-shared.yml

[Screenshot: the configuration file]

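As for the Zipkin client side, that is, the microservice applications, we modify the earlier projects: adding spring-cloud-stream-binder-rabbit to the dependencies of both is enough, and nothing else needs to change.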

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-stream-binder-rabbit</artifactId>
</dependency>

To show that the information really travels through RabbitMQ, however, change spring.zipkin.base-url in both to http://localhost:9412/, which deliberately points at a wrong address.

Restart the user-service and gateway-service projects and start the Zipkin Server. Open http://localhost:5000/user-api/user/hi and then http://localhost:9411/zipkin/ in a browser, and check the RabbitMQ Admin console (http://localhost:15672/).

(😭 I only got the RabbitMQ setup working once; after that the Zipkin Server stopped receiving data, and I am still looking for the cause... 😥)

See also: https://windmt.com/2018/04/24/spring-cloud-12-sleuth-zipkin/

Example code: https://github.com/cralor7/springcloud