Distributed tracing with Spring Boot 2.1.0 + Spring Cloud Greenwich.M1

Date: 2019-11-08


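The main problem

Spring Boot 2.1.0 and Spring Cloud Greenwich.M1 changed how distributed tracing with Sleuth + Zipkin is set up, and these "new features" led me into quite a few pitfalls while getting Sleuth + Zipkin to work.

With Spring Boot 1.x, you had to build both the tracing client and the server yourself, and the server side usually pulled in all sorts of packages (spring-cloud-sleuth-stream, plus the Zipkin-related dependencies, and so on).

With the newer Spring Cloud releases you no longer build your own server (one that integrates Sleuth + Zipkin). Instead, the Zipkin project ships a ready-made zipkin-server.jar, as well as a Docker image, which you download and start from the command line. A few configuration settings decide whether the data Sleuth collects travels to Zipkin over HTTP or through RabbitMQ/Kafka. On the new versions you only need to care about which transport the Sleuth client uses: HTTP or a message queue (RabbitMQ/Kafka). If you choose HTTP, set base-url in the configuration; if you choose a message queue, configure the broker's connection details (host/port/username/password...). Storing the trace data is zipkin-server's job and is controlled by the parameters you start zipkin-server.jar with. (More on this below.)

PS: this is not a tutorial. It mainly covers how I solved a few problems, so there is no complete walkthrough, but I include some code and configuration to keep things clear.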

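Background

I recently started an internship, and my lead asked me to teach myself SC (Spring Cloud). Fair enough, nothing too hard. After I had gone through the whole Spring Cloud family, he told me to focus on its distributed tracing, since I would be given tasks in that area later. So I concentrated on that and decided to build a demo for practice. Having read some tutorials online, I got overconfident and picked the newest version, and promptly fell into a pit...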

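Versions

As usual, the Spring Boot and Spring Cloud versions first:
Spring Boot: 2.1.0
Spring Cloud: Greenwich.M1
My advice to beginners: don't chase the newest release too eagerly; older versions are good enough. For example, Spring Boot 2.0.6 with Spring Cloud Finchley.SR2 is quite stable. If you insist on exploring the newest versions you will find an endless supply of pitfalls, each costing a day or two to climb out of, and stepping into them that often makes it easy for a beginner to give up.
PS: don't ask me how I know...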

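The main topic

Enough chit-chat; on to the main topic.
There are four services in total:
eureka-server
zipkin-server: the new-style Zipkin server. It receives the data Sleuth sends, processes, stores, and indexes it, and provides a UI for visualizing and analyzing the traces.
If you need it, you can download it from GitHub: https://github.com/openzipkin...

Those are the first two.
The other two are the business services, product and order.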

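eureka-server is the service registry. I won't cover its implementation: there are plenty of guides online, and it is essentially the same across versions, with no big jumps between releases. I packaged it as a jar and start it with java -jar XXX.jar whenever I need it.

As for product and order (in a real system, these stand for whatever services A, B, C... you have):

The order service has a single endpoint, /test, which calls product's endpoint.

The productclient here uses Feign to call product's /product/list endpoint.

The product service has a single endpoint, /product/list, which returns the list of all products.

Put simply, the scenario is: order service --(calls)--> product service.

With the scenario covered, here is the relevant configuration for the two services (order and product are configured almost identically).
application.yml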

spring:
  application:
    # service name
    name: product
  # the business logic uses a database, so the MySQL connection settings go here
  datasource:
    driver-class-name: com.mysql.jdbc.Driver
    username: root
    password: 123456
    url: jdbc:mysql://127.0.0.1:3306/sc_sell?characterEncoding=utf-8&useSSL=false&serverTimezone=Asia/Shanghai
  jpa:
    show-sql: true
  # the important part
  zipkin:
    # base-url: needed when the Sleuth client sends the collected data to zipkin-server over HTTP
    base-url: http://localhost:9411
    enabled: true
  sleuth:
    sampler:
      # fraction of traces to collect: 0.1 records only 10% of them; set 1 to trace everything (not recommended in production because of the performance cost)
      probability: 1
eureka:
  client:
    service-url:
      # registry address
      defaultZone: http://localhost:8999/eureka/
logging:
  level:
    # log level for Feign, set as a key-value pair
    org.springframework.cloud.openfeign: debug
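
The sampler probability in the configuration above decides what fraction of traces gets reported. As a rough illustration of the idea only, here is a minimal plain-Java sketch. The class ProbabilitySamplerSketch and its methods are hypothetical names made up for this example, and this is not how Sleuth implements its sampler internally.

```java
import java.util.Random;

/** Hypothetical illustration of probability-based sampling; not Sleuth's real implementation. */
final class ProbabilitySamplerSketch {
    private final double probability;
    private final Random random;

    ProbabilitySamplerSketch(double probability, Random random) {
        if (probability < 0.0 || probability > 1.0) {
            throw new IllegalArgumentException("probability must be between 0.0 and 1.0");
        }
        this.probability = probability;
        this.random = random;
    }

    /** Returns true when the current trace should be reported. */
    boolean isSampled() {
        // nextDouble() lies in [0.0, 1.0), so 1.0 always samples and 0.0 never does.
        return random.nextDouble() < probability;
    }

    /** Counts how many out of the given number of traces would be sampled. */
    static int countSampled(double probability, int traces, long seed) {
        ProbabilitySamplerSketch sampler =
                new ProbabilitySamplerSketch(probability, new Random(seed));
        int sampled = 0;
        for (int i = 0; i < traces; i++) {
            if (sampler.isSampled()) {
                sampled++;
            }
        }
        return sampled;
    }

    public static void main(String[] args) {
        System.out.println("probability=1.0: " + countSampled(1.0, 1000, 42L) + " of 1000 sampled");
        System.out.println("probability=0.1: " + countSampled(0.1, 1000, 42L) + " of 1000 sampled");
    }
}
```

With probability 1 every trace is kept, which is convenient for a demo; a lower value trades completeness for less overhead.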

With the configuration done, on to the dependencies. It's simple: to add tracing to a client you only need one dependency, spring-cloud-starter-zipkin:

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-zipkin</artifactId>
</dependency>

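These are just the basics, right? So let's move on to something more advanced.
Looking at the example above, there are still a few issues to think about.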

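Anyone with some development experience will notice that the data is sent over HTTP. The downside is that HTTP connections tend to be short-lived, so connections (and their three-way handshakes) have to be established over and over, which adds noticeable overhead when many services are being traced.
There is another problem: with direct transmission, if the receiver goes down unexpectedly, the data in flight is lost, and if that data matters the consequences can be serious. Likewise, some scenarios require keeping the trace data for later inspection and comparison, so a database is needed to store it.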

So these issues do need addressing. Happily, Zipkin has nice solutions for both, and they take only a little configuration:

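Put a message broker (RabbitMQ/Kafka) between the Sleuth client and zipkin-server; in my example I only use RabbitMQ.
Store the trace data in a database. Besides the default in-memory storage, zipkin-server's configuration lists MySQL, Cassandra, and Elasticsearch; I use MySQL here.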
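
To see why a queue in the middle helps, here is a small plain-Java sketch in which an in-memory BlockingQueue stands in for the RabbitMQ queue. Everything in it (the class SpanQueueSketch, the span strings) is hypothetical and only illustrates the decoupling; a real broker additionally persists and delivers messages across processes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical stand-in for a broker queue between the tracing client and the collector. */
final class SpanQueueSketch {
    // In the real setup this would be the RabbitMQ queue; here it is an in-memory queue.
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    /** Client side: publishing succeeds whether or not a collector is currently running. */
    void publish(String span) {
        queue.add(span);
    }

    /** Collector side: picks up everything that accumulated while it was offline. */
    List<String> drain() {
        List<String> received = new ArrayList<>();
        queue.drainTo(received);
        return received;
    }

    public static void main(String[] args) {
        SpanQueueSketch broker = new SpanQueueSketch();
        // The services keep reporting while the collector is down.
        broker.publish("order: GET /test");
        broker.publish("product: GET /product/list");
        // The collector comes back later and still receives both spans.
        System.out.println(broker.drain());
    }
}
```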

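If you're new to SC and are asked to implement this, you'll surely open a browser and start searching for tutorials.
You'll find that most blog posts describe how older versions did it; some of the older ones have you build your own zipkin-server (I suspect they are on 1.x). That's frustrating, because it's not what you expected.
Keep looking and, among the sea of posts, you finally find a tutorial on tracing with Spring Boot 2.0.x. Now you're excited: at last something more reliable! But it isn't over yet, because it tells you to add the following dependencies to the client: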

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-sleuth-zipkin-stream</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-sleuth-stream</artifactId>
</dependency>

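Then you discover that the dependencies can't actually be resolved. Why? Because of the version. What, your pom file shows no errors? Open the Maven panel on the right side of IDEA and take a look.

This really is a huge pit. I had no idea what was going on until I happened to open that panel, and I spent a whole day trying to work out why the RabbitMQ integration failed. In the end I concluded that this road leads nowhere.

Finally, with no leads left, I kept searching for tracing tutorials for Spring Boot 2.x. After an afternoon of searching it suddenly hit me: wait, I should just read the official documentation. It's all in English, but at worst I can run it through Chrome's built-in translator. So I opened the Spring website, picked the latest version, looked around, and really did find it, and it turned out to be remarkably simple!
Link: https://cloud.spring.io/sprin...
This is what the official documentation says.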

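Roughly: if you want to use RabbitMQ or Kafka instead of HTTP, add the spring-rabbit or spring-kafka dependency. The default destination name (the queue name) is zipkin. If you use Kafka, you must set the property accordingly: spring.zipkin.sender.type=kafka (for RabbitMQ the value is rabbit).
In other words, you only need these two dependencies!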

<dependency> 
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-zipkin</artifactId>
</dependency>
<dependency> 
    <groupId>org.springframework.amqp</groupId>
    <artifactId>spring-rabbit</artifactId>
</dependency>

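Read a little further and you'll see a note:

spring-cloud-sleuth-stream is deprecated and no longer compatible with what this version provides...
Looking back now, you can see why adding spring-cloud-sleuth-stream in the earlier attempt had no effect.

Then change application.yml: comment out base-url, set spring.zipkin.sender.type to rabbit, add the RabbitMQ connection details, and you're done.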

spring:
  zipkin:
    # only used for the HTTP transport; can be left out when sending through RabbitMQ
    # base-url: http://localhost:9411/
    sender:
      type: rabbit
  rabbitmq:
    host: localhost
    port: 5672
    username: guest
    password: guest

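At this point the tracing part of order/product is done.

As mentioned above, Sleuth collects the data and Zipkin receives the trace information Sleuth sends, then processes, stores, and indexes it and serves the UI. So the next step is to have zipkin-server take the trace data off the RabbitMQ queue and store it in a MySQL database.

As for how to set up zipkin-server, the Zipkin project has already built these features in; you only need to pass parameters at startup. Here's how.

I won't go into which parameters to set for which scenario, since I've only just started with SC and don't know many scenarios well, but I will show how to find the parameters you need.

Method one: edit the bundled configuration file, then start the server.

First, extract zipkin-server.jar with an archive tool. You get three folders, mostly containing .class files.

Then go into the BOOT-INF/classes directory, where you'll find two .yml files. Yes, these are the YAML configuration files.

zipkin-server.yml is the main configuration file of zipkin-server, but when you open it there is only one setting, spring.profiles.include: shared, which pulls in the shared file. So the file to look at is zipkin-server-shared.yml.
Open zipkin-server-shared.yml: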

zipkin:
  self-tracing:
    # Set to true to enable self-tracing.
    enabled: ${SELF_TRACING_ENABLED:false}
    # percentage to self-traces to retain
    sample-rate: ${SELF_TRACING_SAMPLE_RATE:1.0}
    # Timeout in seconds to flush self-tracing data to storage.
    message-timeout: ${SELF_TRACING_FLUSH_INTERVAL:1}
  collector:
    # percentage to traces to retain
    sample-rate: ${COLLECTOR_SAMPLE_RATE:1.0}
    http:
      # Set to false to disable creation of spans via HTTP collector API
      enabled: ${HTTP_COLLECTOR_ENABLED:true}
    kafka:
      # Kafka bootstrap broker list, comma-separated host:port values. Setting this activates the
      # Kafka 0.10+ collector.
      bootstrap-servers: ${KAFKA_BOOTSTRAP_SERVERS:}
      # Name of topic to poll for spans
      topic: ${KAFKA_TOPIC:zipkin}
      # Consumer group this process is consuming on behalf of.
      group-id: ${KAFKA_GROUP_ID:zipkin}
      # Count of consumer threads consuming the topic
      streams: ${KAFKA_STREAMS:1}
    rabbitmq:
      # RabbitMQ server address list (comma-separated list of host:port)
      addresses: ${RABBIT_ADDRESSES:}
      concurrency: ${RABBIT_CONCURRENCY:1}
      # TCP connection timeout in milliseconds
      connection-timeout: ${RABBIT_CONNECTION_TIMEOUT:60000}
      password: ${RABBIT_PASSWORD:guest}
      queue: ${RABBIT_QUEUE:zipkin}
      username: ${RABBIT_USER:guest}
      virtual-host: ${RABBIT_VIRTUAL_HOST:/}
      useSsl: ${RABBIT_USE_SSL:false}
      uri: ${RABBIT_URI:}
  query:
    enabled: ${QUERY_ENABLED:true}
    # 1 day in millis
    lookback: ${QUERY_LOOKBACK:86400000}
    # The Cache-Control max-age (seconds) for /api/v2/services and /api/v2/spans
    names-max-age: 300
    # CORS allowed-origins.
    allowed-origins: "*"

  storage:
    strict-trace-id: ${STRICT_TRACE_ID:true}
    search-enabled: ${SEARCH_ENABLED:true}
    type: ${STORAGE_TYPE:mem}
    mem:
      # Maximum number of spans to keep in memory.  When exceeded, oldest traces (and their spans) will be purged.
      # A safe estimate is 1K of memory per span (each span with 2 annotations + 1 binary annotation), plus
      # 100 MB for a safety buffer.  You'll need to verify in your own environment.
      # Experimentally, it works with: max-spans of 500000 with JRE argument -Xmx600m.
      max-spans: 500000
    cassandra:
      # Comma separated list of host addresses part of Cassandra cluster. Ports default to 9042 but you can also specify a custom port with 'host:port'.
      contact-points: ${CASSANDRA_CONTACT_POINTS:localhost}
      # Name of the datacenter that will be considered "local" for latency load balancing. When unset, load-balancing is round-robin.
      local-dc: ${CASSANDRA_LOCAL_DC:}
      # Will throw an exception on startup if authentication fails.
      username: ${CASSANDRA_USERNAME:}
      password: ${CASSANDRA_PASSWORD:}
      keyspace: ${CASSANDRA_KEYSPACE:zipkin}
      # Max pooled connections per datacenter-local host.
      max-connections: ${CASSANDRA_MAX_CONNECTIONS:8}
      # Ensuring that schema exists, if enabled tries to execute script /zipkin-cassandra-core/resources/cassandra-schema-cql3.txt.
      ensure-schema: ${CASSANDRA_ENSURE_SCHEMA:true}
      # 7 days in seconds
      span-ttl: ${CASSANDRA_SPAN_TTL:604800}
      # 3 days in seconds
      index-ttl: ${CASSANDRA_INDEX_TTL:259200}
      # the maximum trace index metadata entries to cache
      index-cache-max: ${CASSANDRA_INDEX_CACHE_MAX:100000}
      # how long to cache index metadata about a trace. 1 minute in seconds
      index-cache-ttl: ${CASSANDRA_INDEX_CACHE_TTL:60}
      # how many more index rows to fetch than the user-supplied query limit
      index-fetch-multiplier: ${CASSANDRA_INDEX_FETCH_MULTIPLIER:3}
      # Using ssl for connection, rely on Keystore
      use-ssl: ${CASSANDRA_USE_SSL:false}
    cassandra3:
      # Comma separated list of host addresses part of Cassandra cluster. Ports default to 9042 but you can also specify a custom port with 'host:port'.
      contact-points: ${CASSANDRA_CONTACT_POINTS:localhost}
      # Name of the datacenter that will be considered "local" for latency load balancing. When unset, load-balancing is round-robin.
      local-dc: ${CASSANDRA_LOCAL_DC:}
      # Will throw an exception on startup if authentication fails.
      username: ${CASSANDRA_USERNAME:}
      password: ${CASSANDRA_PASSWORD:}
      keyspace: ${CASSANDRA_KEYSPACE:zipkin2}
      # Max pooled connections per datacenter-local host.
      max-connections: ${CASSANDRA_MAX_CONNECTIONS:8}
      # Ensuring that schema exists, if enabled tries to execute script /zipkin2-schema.cql
      ensure-schema: ${CASSANDRA_ENSURE_SCHEMA:true}
      # how many more index rows to fetch than the user-supplied query limit
      index-fetch-multiplier: ${CASSANDRA_INDEX_FETCH_MULTIPLIER:3}
      # Using ssl for connection, rely on Keystore
      use-ssl: ${CASSANDRA_USE_SSL:false}
    elasticsearch:
      # host is left unset intentionally, to defer the decision
      hosts: ${ES_HOSTS:}
      pipeline: ${ES_PIPELINE:}
      max-requests: ${ES_MAX_REQUESTS:64}
      timeout: ${ES_TIMEOUT:10000}
      index: ${ES_INDEX:zipkin}
      date-separator: ${ES_DATE_SEPARATOR:-}
      index-shards: ${ES_INDEX_SHARDS:5}
      index-replicas: ${ES_INDEX_REPLICAS:1}
      username: ${ES_USERNAME:}
      password: ${ES_PASSWORD:}
      http-logging: ${ES_HTTP_LOGGING:}
      legacy-reads-enabled: ${ES_LEGACY_READS_ENABLED:true}
    mysql:
      jdbc-url: ${MYSQL_JDBC_URL:}
      host: ${MYSQL_HOST:localhost}
      port: ${MYSQL_TCP_PORT:3306}
      username: ${MYSQL_USER:}
      password: ${MYSQL_PASS:}
      db: ${MYSQL_DB:zipkin}
      max-active: ${MYSQL_MAX_CONNECTIONS:10}
      use-ssl: ${MYSQL_USE_SSL:false}
  ui:
    enabled: ${QUERY_ENABLED:true}
    ## Values below here are mapped to ZipkinUiProperties, served as /config.json
    # Default limit for Find Traces
    query-limit: 10
    # The value here becomes a label in the top-right corner
    environment:
    # Default duration to look back when finding traces.
    # Affects the "Start time" element in the UI. 1 hour in millis
    default-lookback: 3600000
    # When false, disables the "find a trace" screen
    search-enabled: ${SEARCH_ENABLED:true}
    # Which sites this Zipkin UI covers. Regex syntax. (e.g. http:\/\/example.com\/.*)
    # Multiple sites can be specified, e.g.
    # - .*example1.com
    # - .*example2.com
    # Default is "match all websites"
    instrumented: .*
    # URL placed into the <base> tag in the HTML
    base-path: /zipkin

server:
  port: ${QUERY_PORT:9411}
  use-forward-headers: true
  compression:
    enabled: true
    # compresses any response over min-response-size (default is 2KiB)
    # Includes dynamic json content and large static assets from zipkin-ui
    mime-types: application/json,application/javascript,text/css,image/svg

spring:
  jmx:
     # reduce startup time by excluding unexposed JMX service
     enabled: false
  mvc:
    favicon:
      # zipkin has its own favicon
      enabled: false
  autoconfigure:
    exclude:
      # otherwise we might initialize even when not needed (ex when storage type is cassandra)
      - org.springframework.boot.autoconfigure.jdbc.DataSourceAutoConfiguration
info:
  zipkin:
    version: "2.11.8"

logging:
  pattern:
    level: "%clr(%5p) %clr([%X{traceId}/%X{spanId}]){yellow}"
  level:
    # Silence Invalid method name: '__can__finagle__trace__v3__'
    com.facebook.swift.service.ThriftServiceProcessor: 'OFF'
#     # investigate /api/v2/dependencies
#     zipkin2.internal.DependencyLinker: 'DEBUG'
#     # log cassandra queries (DEBUG is without values)
#     com.datastax.driver.core.QueryLogger: 'TRACE'
#     # log cassandra trace propagation
#     com.datastax.driver.core.Message: 'TRACE'
#     # log reason behind http collector dropped messages
#     zipkin2.server.ZipkinHttpCollector: 'DEBUG'
#     zipkin2.collector.kafka.KafkaCollector: 'DEBUG'
#     zipkin2.collector.kafka08.KafkaCollector: 'DEBUG'
#     zipkin2.collector.rabbitmq.RabbitMQCollector: 'DEBUG'
#     zipkin2.collector.scribe.ScribeCollector: 'DEBUG'

management:
  endpoints:
    web:
      exposure:
        include: '*'
  endpoint:
    health:
      show-details: always
# Disabling auto time http requests since it is added in Undertow HttpHandler in Zipkin autoconfigure
# Prometheus module. In Zipkin we use different naming for the http requests duration
  metrics:
    web:
      server:
        auto-time-requests: false
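
Most values in this file have the form ${NAME:default}: the value comes from the variable NAME if it is set, otherwise from the default after the colon. The sketch below mimics that rule in plain Java to make it concrete. PlaceholderSketch is a hypothetical helper written for this article, not Spring's actual resolver, which also consults other property sources such as command-line arguments.

```java
import java.util.Collections;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that mimics how a ${NAME:default} placeholder picks its value. */
final class PlaceholderSketch {
    // Matches ${NAME:default}; group 1 is the variable name, group 2 the (possibly empty) default.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^:}]+):([^}]*)\\}");

    /** Replaces every placeholder with the variable's value, or with its default when unset. */
    static String resolve(String value, Map<String, String> env) {
        Matcher matcher = PLACEHOLDER.matcher(value);
        StringBuffer resolved = new StringBuffer();
        while (matcher.find()) {
            String name = matcher.group(1);
            String fallback = matcher.group(2);
            String replacement = env.getOrDefault(name, fallback);
            matcher.appendReplacement(resolved, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(resolved);
        return resolved.toString();
    }

    public static void main(String[] args) {
        // Nothing set: the default after the colon wins.
        System.out.println(resolve("${STORAGE_TYPE:mem}", Collections.<String, String>emptyMap()));
        // STORAGE_TYPE set: its value replaces the default.
        System.out.println(resolve("${STORAGE_TYPE:mem}",
                Collections.singletonMap("STORAGE_TYPE", "mysql")));
    }
}
```

Run as is, this prints mem and then mysql, which is why the storage type stays in memory unless STORAGE_TYPE (or the zipkin.storage.type property) is changed.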

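zipkin-server-shared.yml is an ordinary configuration file: for the component you want to use, you change only its settings. For example, to have the storage component save trace data to MySQL, I only need to change the corresponding settings: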

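zipkin:
  storage:
    # the other settings need no changes and are omitted
    # select MySQL; the bundled default is mem
    type: mysql
    mysql:
      # must be a MySQL JDBC URL
      jdbc-url: jdbc:mysql://localhost?XXX=xxx
      host: localhost
      port: 3306
      username: root
      password: 123456
      db: zipkin
      # maximum number of connections
      max-active: ${MYSQL_MAX_CONNECTIONS:10}
      # whether to use SSL
      use-ssl: ${MYSQL_USE_SSL:false}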

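After editing the configuration, repackage everything into a jar and start it.

Method two: override the settings with command-line arguments when starting zipkin-server.jar.

Run java -jar zipkin-server.jar --zipkin.storage.type=mysql --zipkin.storage.mysql.username=root --zipkin.storage.mysql.password=123456 --zipkin.storage.mysql.host=localhost --zipkin.storage.mysql.port=3306 ...
The arguments after the jar name are property overrides. To see which properties exist, look at the yml file from method one; they map one to one. The advantage of this method is that you don't have to modify the jar; the drawback is the long string of arguments on the command line.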
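
The same settings can also be supplied through the environment variables named inside the ${...} placeholders of the yml file (STORAGE_TYPE, MYSQL_USER, RABBIT_ADDRESSES, and so on). Below is a hedged plain-Java sketch that launches the jar this way. ZipkinLauncherSketch is a hypothetical class name, the values are the demo values from this article, and I am assuming that setting RABBIT_ADDRESSES is what switches on the RabbitMQ collector; check this against your Zipkin version.

```java
import java.util.Map;

/** Hypothetical launcher that starts zipkin-server.jar with settings passed as environment variables. */
final class ZipkinLauncherSketch {

    /** Builds (but does not start) the process; variable names are taken from zipkin-server-shared.yml. */
    static ProcessBuilder buildCommand(String jarPath) {
        ProcessBuilder builder = new ProcessBuilder("java", "-jar", jarPath);
        Map<String, String> env = builder.environment();
        // Storage: switch from the default "mem" to MySQL (demo credentials from this article).
        env.put("STORAGE_TYPE", "mysql");
        env.put("MYSQL_HOST", "localhost");
        env.put("MYSQL_TCP_PORT", "3306");
        env.put("MYSQL_USER", "root");
        env.put("MYSQL_PASS", "123456");
        env.put("MYSQL_DB", "zipkin");
        // Collector: assumption that a non-empty address list enables reading from RabbitMQ.
        env.put("RABBIT_ADDRESSES", "localhost:5672");
        // Show the server's log output in this console.
        builder.inheritIO();
        return builder;
    }

    public static void main(String[] args) throws Exception {
        // Assumes zipkin-server.jar is in the current working directory.
        Process zipkin = buildCommand("zipkin-server.jar").start();
        System.exit(zipkin.waitFor());
    }
}
```

Compared with method one, nothing inside the jar is touched, and compared with the long argument list, the settings can live in whatever mechanism already manages your environment.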

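That's basically it. The rest of the configuration works on the same principle; once you can reason by analogy, the other settings won't be a problem.

Summary

I was busy while writing this and published it before it was finished, so content was missing; I finally used the weekend to fill it in, thankfully. This is my first article. It mainly records my own pitfalls rather than being a professionally written tutorial, and much of it is casual notes. If anything is unclear, feel free to leave a question; likewise, if I got something wrong, please don't hesitate to point it out. Thank you.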
