Zipkin and Microservice Distributed Tracing

Date: 2019-11-08

Original article: https://cloud.tencent.com/developer/article/1082821

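This post is about Zipkin and distributed tracing.

First, as usual, we create three projects with Spring Initializr: a Zipkin server (the zipkin-service project) and two ordinary business applications named service and client.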

zipkin service

client

service

As shown above, we added two dependencies: web and zipkin client.

Creating the Zipkin server application

First open the zipkin-service project.

Let's look at its dependencies:

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-zipkin</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>

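Those are the default dependencies. They all need to be replaced; otherwise the Zipkin server will not work properly. (Also note that this example uses Spring Boot 1.4.3.RELEASE and Spring Cloud Camden.SR4.)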

Spring Boot version:

<parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>1.4.3.RELEASE</version>
    <relativePath/> <!-- lookup parent from repository -->
</parent>

Spring Cloud version:

<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-dependencies</artifactId>
            <version>Camden.SR4</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>

Replace the dependencies with the following:

<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter</artifactId>
    </dependency>
    <dependency>
        <groupId>io.zipkin.java</groupId>
        <artifactId>zipkin-server</artifactId>
    </dependency>
    <dependency>
        <groupId>io.zipkin.java</groupId>
        <artifactId>zipkin-autoconfigure-ui</artifactId>
        <scope>runtime</scope>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
    </dependency>
</dependencies>

Now let's start the actual development.

First configure the server port.

application.properties:

server.port=9411

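Then add the @EnableZipkinServer annotation to the application class.

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import zipkin.server.EnableZipkinServer;

@EnableZipkinServer
@SpringBootApplication
public class ServiceApplication {

    public static void main(String[] args) {
        SpringApplication.run(ServiceApplication.class, args);
    }
}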

Then start the Zipkin server and open its UI:

http://localhost:9411/

Good, the server is nearly ready. Now let's prepare the client.

Creating the client application

Configure the port:

server.port=9876

Configure the application name:

spring.application.name=client

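Then create a REST API:

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.client.RestTemplate;

@RestController
@SpringBootApplication
public class ClientApplication {

    @Bean
    RestTemplate restTemplate() {
        return new RestTemplate();
    }

    @GetMapping("/hi")
    public String hi() {
        // Call the hi endpoint of the service application, which listens on port 9081.
        return this.restTemplate().getForEntity("http://localhost:9081/hi", String.class).getBody();
    }

    public static void main(String[] args) {
        SpringApplication.run(ClientApplication.class, args);
    }
}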

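The logic is simple: a REST API that calls the hi endpoint of the other application, service.

Creating the service application

Now create the service application.

Configure the port: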

server.port=9081

Configure the application name:

spring.application.name=service

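Code:

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@SpringBootApplication
@RestController
public class ServiceApplication {

    @GetMapping("/hi")
    public String hi() {
        return "Hello World";
    }

    public static void main(String[] args) {
        SpringApplication.run(ServiceApplication.class, args);
    }
}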

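Trying it out

The Zipkin server is already running. Now start client and service.

Then we simulate a call.

In the browser, request the /hi endpoint of the client application.

It returns "Hello World".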

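Now refresh the Zipkin server UI. The service name drop-down, previously grayed out, is now enabled.

It lists the names of the two applications we just created: service and client.

Select the client application and look at the result:

The call we just made can now be found.

Click it to view the details: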

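The detail view shows the depth of this request, the total number of spans, the services involved, and the total duration. It also shows the call chain, so you can see the time spent in each service and the parent-child relationships.

You can also click an individual service segment, that is, a span, to pop up its detailed metrics:

The span details show the IP of the host the span ran on, the HTTP method and path of the request, the name of the class involved, and so on.

They also show the details of each stage of the request within that span.

Everything shown above is rendered from JSON data. Click "JSON" to see more detailed data and, through it, get to know Zipkin's data model: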

Besides the tracing shown above, Zipkin also provides a dependency view.

Only two services are involved here, so the dependency graph is simple.
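
A dependency view of this kind can in principle be derived from the parent-child relationships between spans. The following is a minimal sketch of that idea, not Zipkin's actual aggregation code: the CallSpan class and the links method are hypothetical names introduced only for illustration, and a dependency is assumed to mean "a parent span in one service has a child span in another".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class DependencySketch {
    // Hypothetical, simplified span; not Zipkin's real model.
    static final class CallSpan {
        final long id;
        final Long parentId;      // null for the root span
        final String serviceName;

        CallSpan(long id, Long parentId, String serviceName) {
            this.id = id;
            this.parentId = parentId;
            this.serviceName = serviceName;
        }
    }

    // Counts calls per "caller->callee" pair by following parent/child spans.
    static Map<String, Integer> links(List<CallSpan> trace) {
        Map<Long, CallSpan> byId = new HashMap<>();
        for (CallSpan s : trace) {
            byId.put(s.id, s);
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (CallSpan s : trace) {
            CallSpan parent = s.parentId == null ? null : byId.get(s.parentId);
            // Only a parent in a different service counts as a dependency.
            if (parent != null && !parent.serviceName.equals(s.serviceName)) {
                counts.merge(parent.serviceName + "->" + s.serviceName, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<CallSpan> trace = new ArrayList<>();
        trace.add(new CallSpan(1L, null, "client"));
        trace.add(new CallSpan(2L, 1L, "service"));
        System.out.println(links(trace)); // prints {client->service=1}
    }
}
```

With the two applications from this article, the only link is client->service, which matches the simple graph described above.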

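Source code walkthrough and configuration parameters

You may wonder how the Zipkin server received and displayed data when we did not configure anything.

It looks like magic, but it is not. Let's look at the source code.

First, the dependencies we added: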

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-zipkin</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-test</artifactId>
    <scope>test</scope>
</dependency>

There are three in total. The one directly related to Zipkin is:

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-zipkin</artifactId>
</dependency>

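Now find this jar and take a look:

It contains no code. It is just a starter, and starters often look like this: they only pull in dependencies through their pom:

Let's see which dependencies the pom declares.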

There are only two:

<dependencies>
    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-sleuth</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-sleuth-zipkin</artifactId>
    </dependency>
</dependencies>

Which one should we open? Let's try spring-cloud-sleuth-zipkin first: its name contains the keyword zipkin, so it is probably the bridge to Zipkin:

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-sleuth-zipkin</artifactId>
</dependency>

In the spring-cloud-sleuth-zipkin package we find ZipkinAutoConfiguration.

Let's open it:

@Configuration
@EnableConfigurationProperties({ZipkinProperties.class, SamplerProperties.class})
@ConditionalOnProperty(value = "spring.zipkin.enabled", matchIfMissing = true)
@AutoConfigureBefore(TraceAutoConfiguration.class)
public class ZipkinAutoConfiguration {
    ...
}

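This largely explains why the Zipkin client worked in the background without any configuration: it relies on Spring Boot's auto-configuration mechanism (AutoConfiguration), which applies the configuration automatically.

Note that two properties classes are registered on this class: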

@EnableConfigurationProperties({ZipkinProperties.class, SamplerProperties.class})

Let's look at ZipkinProperties first:

ZipkinProperties

/**
 * Zipkin settings
 */
@ConfigurationProperties("spring.zipkin")
public class ZipkinProperties {
    /** URL of the zipkin query server instance. */
    private String baseUrl = "http://localhost:9411/";
    private boolean enabled = true;
    private int flushInterval = 1;
    private Compression compression = new Compression();
    private Service service = new Service();
    private Locator locator = new Locator();
    ...
}

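Only the fields are shown here, because these are exactly the Zipkin properties we can set in application.properties.

Configuring the Zipkin server address:

A default value is set here. By default the Zipkin client sends data to port 9411 on localhost: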

 private String baseUrl = "http://localhost:9411/";

In production, you can set your own Zipkin address in application.properties:

spring.zipkin.base-url=http://localhost:9511/

Flush interval

You can change the flush interval with the following property. The default is 1 second:

spring.zipkin.flush-interval=1
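
To make the idea of a flush interval concrete, here is a minimal sketch of interval-based batching: finished spans are only buffered, and a timer sends the whole buffer once per interval. It is an illustration under assumptions, not Sleuth's reporter; BatchingReporterSketch, report, and flush are hypothetical names, and spans are represented as plain JSON strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical sketch of interval-based flushing; not Sleuth's reporter.
final class BatchingReporterSketch implements AutoCloseable {
    private final List<String> pending = new ArrayList<>();
    private final Consumer<List<String>> sender;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    BatchingReporterSketch(int flushIntervalSeconds, Consumer<List<String>> sender) {
        this.sender = sender;
        // Flush the buffer at a fixed rate, expressed in seconds.
        scheduler.scheduleAtFixedRate(this::flush,
                flushIntervalSeconds, flushIntervalSeconds, TimeUnit.SECONDS);
    }

    // Called when a span finishes: only buffers, never touches the network.
    synchronized void report(String spanJson) {
        pending.add(spanJson);
    }

    // Sends everything buffered so far as one batch.
    synchronized void flush() {
        if (pending.isEmpty()) {
            return;
        }
        List<String> batch = new ArrayList<>(pending);
        pending.clear();
        sender.accept(batch);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
        flush(); // do not lose spans buffered since the last tick
    }

    public static void main(String[] args) {
        try (BatchingReporterSketch reporter = new BatchingReporterSketch(
                1, batch -> System.out.println("sending " + batch.size() + " span(s)"))) {
            reporter.report("{\"name\":\"hi\"}");
            reporter.report("{\"name\":\"hi\"}");
        } // close() flushes whatever is still buffered
    }
}
```

A longer interval means fewer, larger requests to the Zipkin server at the cost of traces showing up later in the UI.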

Data compression support

As you may have noticed, besides the simple-typed fields there are several custom reference types: Compression, Service, and Locator. Let's look at Compression:

/** When enabled, spans are gzipped before sent to the zipkin server */
public static class Compression {
    private boolean enabled = false;
    ....
}

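As the comment says, this enables compression, which is off by default. You can turn it on in the configuration file so that data is compressed before it is sent to the Zipkin server: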

spring.zipkin.compression.enabled=true
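
What "gzipped before sent" means for a payload can be sketched with the JDK's own gzip streams. This is only an illustration of the compression step, assuming a JSON payload of repeated span objects; GzipSketch and its methods are hypothetical names and are not part of Sleuth.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipSketch {
    // Compresses a payload the way a client could before sending it.
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Restores the original bytes, as the receiving side would.
    static byte[] gunzip(byte[] compressed) {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return gz.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A made-up batch of similar spans; repetitive JSON compresses well.
        StringBuilder batch = new StringBuilder("[");
        for (int i = 0; i < 50; i++) {
            batch.append("{\"traceId\":\"1\",\"id\":\"").append(i).append("\",\"name\":\"hi\"},");
        }
        batch.append("{}]");
        byte[] raw = batch.toString().getBytes(StandardCharsets.UTF_8);
        byte[] packed = gzip(raw);
        System.out.println(raw.length + " bytes raw, " + packed.length + " bytes gzipped");
    }
}
```

Compression trades a little CPU on the client for less network traffic, which tends to pay off when batches contain many similar spans.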

Custom service name

Next, Service:

/** When set will override the default {@code spring.application.name} value of the service id */
public static class Service {
    /** The name of the service, from which the Span was sent via HTTP, that should appear in Zipkin */
    private String name;
    ...
}

By default the service name is read from spring.application.name. You can override that default and set the service name you want with the following property:

spring.zipkin.service.name=service1

Service discovery locator support

Locator:

public static class Locator {
    private Discovery discovery;
    .... // skip setter getter

    public static class Discovery {
        /** Enabling of locating the host name via service discovery */
        private boolean enabled;
        ..... // skip setter getter
    }
}

This lets the host name be located through service discovery:

spring.zipkin.locator.discovery.enabled=true

Configuring the sampling rate

As you may have noticed, the auto-configuration class registers two properties classes: ZipkinProperties and SamplerProperties. Next, let's look at SamplerProperties.

/**
 * Properties related to sampling
 */
@ConfigurationProperties("spring.sleuth.sampler")
public class SamplerProperties {
    /**
     * Percentage of requests that should be sampled. E.g. 1.0 - 100% requests should be
     * sampled. The precision is whole-numbers only (i.e. there's no support for 0.1% of
     * the traces).
     */
    private float percentage = 0.1f;
}

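The code shows that this is a sampling setting. By default 10% of requests are sampled. The precision is whole percentages only; for example, 0.1% is not supported.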

# Change the sampling rate to 20%
spring.sleuth.sampler.percentage=0.2
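
One way to see why the precision is limited to whole percentages is to picture a sampler that keeps 100 yes/no slots and cycles through them. The sketch below works that way; it is an illustration under that assumption, and the real sampler in Spring Cloud Sleuth may differ. PercentageSamplerSketch is a hypothetical name.

```java
import java.util.BitSet;
import java.util.Random;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch; the real sampler in Spring Cloud Sleuth may differ.
final class PercentageSamplerSketch {
    private static final int SLOTS = 100;
    private final BitSet decisions = new BitSet(SLOTS);
    private final AtomicInteger counter = new AtomicInteger();

    PercentageSamplerSketch(float percentage, Random random) {
        // Whole percents only: 0.2f marks 20 slots, 0.001f marks none.
        int sampled = Math.max(0, Math.min(SLOTS, Math.round(percentage * SLOTS)));
        while (decisions.cardinality() < sampled) {
            decisions.set(random.nextInt(SLOTS));
        }
    }

    // Each request takes the next slot; the slot decides whether it is traced.
    boolean isSampled() {
        int slot = Math.floorMod(counter.getAndIncrement(), SLOTS);
        return decisions.get(slot);
    }

    public static void main(String[] args) {
        PercentageSamplerSketch sampler = new PercentageSamplerSketch(0.2f, new Random());
        int sampled = 0;
        for (int i = 0; i < 100; i++) {
            if (sampler.isSampled()) {
                sampled++;
            }
        }
        System.out.println(sampled + " of 100 requests sampled"); // prints "20 of 100 requests sampled"
    }
}
```

With only 100 slots, a rate such as 0.1% rounds down to zero marked slots, so no request would be sampled at all.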

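Custom sampling rules

Besides configuring a rate as above, you can define sampling rules programmatically. For example, you could sample only requests that return 500, or ignore successful requests and sample only failed ones. The following samples roughly half of all requests:

@Bean
Sampler customSampler() {
    return span -> Math.random() > .5;
}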

Beyond the settings above there are further Sleuth settings, which are not covered one by one here. You can find them in the auto-configuration classes of spring-cloud-sleuth-core.

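Basic concepts

Distributed tracing has two basic concepts: Trace and Span. A trace corresponds to one real business request, and it may pass through many spans. A span corresponds to each service the request touches. A trace has a trace ID that ties all of its spans together, and each span also has its own ID. A span also carries metadata, most commonly the start and end time of the call. You can attach business-related metadata to a span as well.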
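
These concepts can be written down as a small data model. The sketch below is a simplification for illustration and not Zipkin's real Span class: the field names, the tags map, and the helper methods depth and totalDurationMicros are all assumptions. It also shows how figures such as the depth and total duration of a request could be computed from the spans of one trace.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class TraceModelSketch {
    // Hypothetical minimal span; Zipkin's real Span has more fields.
    static final class Span {
        final long traceId;       // shared by every span of one request
        final long id;            // unique per span
        final Long parentId;      // null for the root span
        final String serviceName;
        final long startMicros;
        final long endMicros;
        final Map<String, String> tags = new HashMap<>(); // extra metadata

        Span(long traceId, long id, Long parentId, String serviceName,
             long startMicros, long endMicros) {
            this.traceId = traceId;
            this.id = id;
            this.parentId = parentId;
            this.serviceName = serviceName;
            this.startMicros = startMicros;
            this.endMicros = endMicros;
        }
    }

    // Longest parent chain in the trace (a lone root span has depth 1).
    static int depth(List<Span> trace) {
        Map<Long, Span> byId = new HashMap<>();
        for (Span s : trace) {
            byId.put(s.id, s);
        }
        int max = 0;
        for (Span s : trace) {
            int d = 1;
            Span cur = s;
            // Walk up until the root, or until a parent missing from this list.
            while (cur.parentId != null && byId.containsKey(cur.parentId)) {
                cur = byId.get(cur.parentId);
                d++;
            }
            max = Math.max(max, d);
        }
        return max;
    }

    // Time from the earliest start to the latest end.
    static long totalDurationMicros(List<Span> trace) {
        long start = Long.MAX_VALUE;
        long end = Long.MIN_VALUE;
        for (Span s : trace) {
            start = Math.min(start, s.startMicros);
            end = Math.max(end, s.endMicros);
        }
        return trace.isEmpty() ? 0 : end - start;
    }

    public static void main(String[] args) {
        Span root = new Span(7L, 1L, null, "client", 0L, 5000L);
        root.tags.put("http.path", "/hi");
        Span child = new Span(7L, 2L, 1L, "service", 1000L, 4000L);
        List<Span> trace = List.of(root, child);
        System.out.println("spans=" + trace.size() + " depth=" + depth(trace)
                + " durationMicros=" + totalDurationMicros(trace));
        // prints spans=2 depth=2 durationMicros=5000
    }
}
```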

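Supported request types

Once Spring Cloud Sleuth (org.springframework.cloud:spring-cloud-starter-sleuth) is on the CLASSPATH, it automatically instruments the following common components:

Requests made over messaging technologies such as Apache Kafka or RabbitMQ (or any other Spring Cloud Stream binder).
HTTP headers received at Spring MVC controllers.
Requests that pass through the Netflix Zuul microproxy.
Requests made with RestTemplate and similar clients.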
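
For HTTP calls, Zipkin-compatible tracers conventionally carry the trace context in B3 headers (X-B3-TraceId, X-B3-SpanId, X-B3-ParentSpanId, X-B3-Sampled). The sketch below shows the basic rule for an outgoing call: keep the trace ID, create a new span ID, and record the current span as the parent. It is a simplified illustration, not Sleuth's injector; the class and method names are hypothetical, and defaulting X-B3-Sampled to "1" is an assumption made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

final class B3PropagationSketch {
    // Builds headers for an outgoing call from the headers of the request being handled.
    static Map<String, String> outgoingHeaders(Map<String, String> incoming) {
        Map<String, String> out = new LinkedHashMap<>();
        String traceId = incoming.get("X-B3-TraceId");
        if (traceId == null) {
            // No trace yet: this call starts a new trace with a root span.
            out.put("X-B3-TraceId", newId());
        } else {
            // Same trace; the current span becomes the parent of the new span.
            out.put("X-B3-TraceId", traceId);
            out.put("X-B3-ParentSpanId", incoming.get("X-B3-SpanId"));
        }
        out.put("X-B3-SpanId", newId());
        // Assumption for this sketch: sample when no decision was passed in.
        out.put("X-B3-Sampled", incoming.getOrDefault("X-B3-Sampled", "1"));
        return out;
    }

    // 64-bit ID rendered as 16 lower-case hex characters.
    static String newId() {
        return String.format("%016x", ThreadLocalRandom.current().nextLong());
    }

    public static void main(String[] args) {
        // A request without trace headers starts the trace.
        Map<String, String> firstCall = outgoingHeaders(Map.of());
        // The callee handles that request and makes a further call of its own.
        Map<String, String> secondCall = outgoingHeaders(firstCall);
        System.out.println(firstCall);
        System.out.println(secondCall);
    }
}
```

Both printed maps share the same X-B3-TraceId, which is what lets the Zipkin server stitch spans from different services into one trace.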

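Storage

The Zipkin server delegates writes to the persistence layer through a SpanStore. Out of the box, a MySQL-backed SpanStore and an in-memory SpanStore are supported. By default, data is stored in memory.

SpanStore

This interface is the abstraction for persisting trace data. Its methods are: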

public interface SpanStore {
    List<List<Span>> getTraces(QueryRequest request);

    @Nullable
    List<Span> getTrace(long traceIdHigh, long traceIdLow);

    @Nullable
    List<Span> getRawTrace(long traceIdHigh, long traceIdLow);

    @Deprecated
    @Nullable
    List<Span> getTrace(long traceId);

    @Deprecated
    @Nullable
    List<Span> getRawTrace(long traceId);

    List<String> getServiceNames();

    List<String> getSpanNames(String serviceName);

    List<DependencyLink> getDependencies(long endTs, @Nullable Long lookback);
}

Let's take just the first method to look at the internal structure of trace data:

List<List<Span>> getTraces(QueryRequest request);

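getTraces takes a QueryRequest as its argument. If you were designing this interface, you might have passed serviceName, or several separate parameters.

Here a single object carries all the parameters, which is a good example of how to design a query interface with many parameters.

The return value of getTraces is a two-dimensional list. One List<Span> is one trace, and the collection of List<Span> values models the trace data store, which is then queried through the filters passed in QueryRequest.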

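QueryRequest

The query parameter object. It encapsulates the query conditions.

public final class QueryRequest {
    /** Service name. */
    @Nullable
    public final String serviceName;

    /** Span name; matches all traces that contain a span with this name. */
    @Nullable
    public final String spanName;

    /** Query by values in the annotations node of the JSON metadata. */
    public final List<String> annotations;

    /** Query by the binaryAnnotations in the JSON metadata. */
    public final Map<String, String> binaryAnnotations;

    /** Duration is greater than or equal to this value. */
    @Nullable
    public final Long minDuration;

    /** Duration is less than or equal to this value. */
    @Nullable
    public final Long maxDuration;

    /** Only show results before this time; defaults to the current time. */
    public final long endTs;

    /**
     * Only show results after (endTs - lookback); defaults to endTs, so the
     * window runs from endTs - lookback to endTs.
     */
    public final long lookback;

    /** Number of results per query; 10 records by default. */
    public final int limit;
    ...
}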
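
The combination of a parameter object and a list of traces can be sketched as follows. This is a deliberately reduced illustration, not Zipkin's query logic: the Query class mirrors only three of the QueryRequest fields, the Span class is hypothetical, and "a trace matches if any one of its spans satisfies all set filters" is an assumption chosen to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

final class TraceQuerySketch {
    // Hypothetical minimal span, just enough to filter on.
    static final class Span {
        final String serviceName;
        final long duration;

        Span(String serviceName, long duration) {
            this.serviceName = serviceName;
            this.duration = duration;
        }
    }

    // Hypothetical parameter object with a small subset of QueryRequest's fields.
    static final class Query {
        String serviceName;  // null: any service
        Long minDuration;    // null: no lower bound
        int limit = 10;      // at most this many traces are returned
    }

    // Each inner list is one trace; the outer list plays the role of the store.
    static List<List<Span>> getTraces(List<List<Span>> store, Query query) {
        List<List<Span>> result = new ArrayList<>();
        for (List<Span> trace : store) {
            if (result.size() >= query.limit) {
                break;
            }
            if (matches(trace, query)) {
                result.add(trace);
            }
        }
        return result;
    }

    // Assumption: a trace matches when one span passes every filter that is set.
    private static boolean matches(List<Span> trace, Query query) {
        for (Span s : trace) {
            boolean serviceOk = query.serviceName == null
                    || query.serviceName.equals(s.serviceName);
            boolean durationOk = query.minDuration == null
                    || s.duration >= query.minDuration;
            if (serviceOk && durationOk) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<List<Span>> store = List.of(
                List.of(new Span("client", 5000L), new Span("service", 3000L)),
                List.of(new Span("client", 100L)));
        Query query = new Query();
        query.serviceName = "service";
        System.out.println(getTraces(store, query).size()); // prints 1
    }
}
```

Adding a new filter means adding one field to the parameter object, while the getTraces signature stays unchanged, which is the design benefit pointed out above.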

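InMemorySpanStore

This class is the default "persistent" storage implementation. The quotation marks are there because nothing is really persisted: the data only lives in memory. This storage option is only suitable for testing.

/** Internally, spans are indexed on 64-bit trace ID */
public final class InMemorySpanStore implements SpanStore {
    ...
}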

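Zipkin also supports MySQL, Cassandra, and Elasticsearch as storage backends. MySQL has some performance problems, so in production one of the latter two is the practical choice.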

How trace instrumentation is implemented

The instrumentation implementations shown in the figure above are supported by default. Briefly: web requests are instrumented with a filter, while Hystrix is instrumented by wrapping HystrixCommand:

public abstract class TraceCommand<R> extends HystrixCommand<R> {
    ...
    @Override
    protected R run() throws Exception {
        String commandKeyName = getCommandKey().name();
        Span span = this.tracer.createSpan(commandKeyName, this.parentSpan);
        this.tracer.addTag(Span.SPAN_LOCAL_COMPONENT_TAG_NAME, HYSTRIX_COMPONENT);
        this.tracer.addTag(this.traceKeys.getHystrix().getPrefix()
                + this.traceKeys.getHystrix().getCommandKey(), commandKeyName);
        this.tracer.addTag(this.traceKeys.getHystrix().getPrefix()
                + this.traceKeys.getHystrix().getCommandGroup(), getCommandGroup().name());
        this.tracer.addTag(this.traceKeys.getHystrix().getPrefix()
                + this.traceKeys.getHystrix().getThreadPoolKey(), getThreadPoolKey().name());
        try {
            return doRun();
        } finally {
            this.tracer.close(span);
        }
    }

    public abstract R doRun() throws Exception;
}

Zuul is instrumented with a ZuulFilter:

public class TracePreZuulFilter extends ZuulFilter {
    ...
    @Override
    public Object run() {
        getCurrentSpan().logEvent(Span.CLIENT_SEND);
        return null;
    }

    @Override
    public ZuulFilterResult runFilter() {
        RequestContext ctx = RequestContext.getCurrentContext();
        Span span = getCurrentSpan();
        if (log.isDebugEnabled()) {
            log.debug("Current span is " + span + "");
        }
        markRequestAsHandled(ctx);
        Span newSpan = this.tracer.createSpan(span.getName(), span);
        newSpan.tag(Span.SPAN_LOCAL_COMPONENT_TAG_NAME, ZUUL_COMPONENT);
        this.spanInjector.inject(newSpan, ctx);
        this.httpTraceKeysInjector.addRequestTags(newSpan,
                URI.create(ctx.getRequest().getRequestURI()),
                ctx.getRequest().getMethod());
        if (log.isDebugEnabled()) {
            log.debug("New Zuul Span is " + newSpan + "");
        }
        ZuulFilterResult result = super.runFilter();
        if (log.isDebugEnabled()) {
            log.debug("Result of Zuul filter is [" + result.getStatus() + "]");
        }
        if (ExecutionStatus.SUCCESS != result.getStatus()) {
            if (log.isDebugEnabled()) {
                log.debug("The result of Zuul filter execution was not successful thus "
                        + "will close the current span " + newSpan);
            }
            this.tracer.close(newSpan);
        }
        return result;
    }

    // TraceFilter will not create the "fallback" span
    private void markRequestAsHandled(RequestContext ctx) {
        ctx.getRequest().setAttribute(TraceRequestAttributes.HANDLED_SPAN_REQUEST_ATTR, "true");
    }
    ...
}

Scheduling is instrumented with an aspect:

@Aspect
public class TraceSchedulingAspect {
    ....
    @Around("execution (@org.springframework.scheduling.annotation.Scheduled * *.*(..))")
    public Object traceBackgroundThread(final ProceedingJoinPoint pjp) throws Throwable {
        if (this.skipPattern.matcher(pjp.getTarget().getClass().getName()).matches()) {
            return pjp.proceed();
        }
        String spanName = SpanNameUtil.toLowerHyphen(pjp.getSignature().getName());
        Span span = this.tracer.createSpan(spanName);
        this.tracer.addTag(Span.SPAN_LOCAL_COMPONENT_TAG_NAME, SCHEDULED_COMPONENT);
        this.tracer.addTag(this.traceKeys.getAsync().getPrefix()
                + this.traceKeys.getAsync().getClassNameKey(),
                pjp.getTarget().getClass().getSimpleName());
        this.tracer.addTag(this.traceKeys.getAsync().getPrefix()
                + this.traceKeys.getAsync().getMethodNameKey(),
                pjp.getSignature().getName());
        try {
            return pjp.proceed();
        } finally {
            this.tracer.close(span);
        }
    }
}

Messaging middleware is instrumented with an ExecutorChannelInterceptor:

abstract class AbstractTraceChannelInterceptor extends ChannelInterceptorAdapter
        implements ExecutorChannelInterceptor {
    ...
}
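
The Hystrix and scheduling excerpts above share one shape: create a span, run the real work, and close the span in a finally block so it is closed even when the work fails. A minimal, framework-free sketch of that shape follows. RecordingTracer is a hypothetical stand-in that only records events; it is not Sleuth's Tracer, and spans are represented by their names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

final class SpanLifecycleSketch {
    // Hypothetical stand-in for a tracer: it only records open/close events.
    static final class RecordingTracer {
        final List<String> events = new ArrayList<>();

        String createSpan(String name) {
            events.add("open " + name);
            return name;
        }

        void close(String span) {
            events.add("close " + span);
        }
    }

    // Wraps any unit of work in a span; the span closes even if the work throws.
    static <R> R traced(RecordingTracer tracer, String spanName, Supplier<R> work) {
        String span = tracer.createSpan(spanName);
        try {
            return work.get();
        } finally {
            tracer.close(span);
        }
    }

    public static void main(String[] args) {
        RecordingTracer tracer = new RecordingTracer();
        String result = traced(tracer, "hi", () -> "Hello World");
        System.out.println(result + " " + tracer.events);
        // prints Hello World [open hi, close hi]
    }
}
```

A filter, an aspect, or a command wrapper differs mainly in where this wrapper is hooked in, not in what it does around the call.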

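Summary

The core of distributed tracing is the trace ID and the span ID. With them, metadata can be collected during each span and stored, together with the span ID, as one record in the trace store.

This article first showed how to set up a Zipkin server, then started two services, simulated a call, and walked through the basic use of the Zipkin server.

Then, by reading the entry-point source code, we saw which parameters can be set in application.properties.

Finally, it covered the basic concepts of distributed tracing and showed Zipkin's basic storage structure.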
