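Microservices have profoundly changed how software is developed and delivered. A monolithic application is split into many microservices, the complexity of each individual service drops sharply, and dependencies between libraries become dependencies between services. The resulting problem is that deployment granularity becomes finer and finer, and the large number of services puts heavy pressure on operations. Fortunately, Kubernetes can solve most of these operational problems.
As the number of services grows and internal call chains become more complex, logs and performance monitoring alone are rarely enough to "See the Whole Picture"; troubleshooting or performance analysis becomes like the blind men describing an elephant. Distributed tracing helps developers analyze request paths visually, locate performance bottlenecks quickly, and gradually optimize inter-service dependencies. It also helps developers understand the whole distributed system from a broader perspective.
A distributed tracing system consists of roughly three parts: data collection, data persistence, and data presentation. Data collection means instrumenting the code: marking which stages of a request should be reported, and which parent stage each recorded stage belongs to. Data persistence means writing the reported data to storage; Jaeger, for example, supports several storage backends, including Cassandra and Elasticsearch. Data presentation means the front end queries the request stages associated with a Trace ID and renders them in the UI.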
Figure: microservice communication architecture
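Google deployed a distributed tracing system, Dapper, internally as early as 2005 and later published the paper "Dapper, a Large-Scale Distributed Systems Tracing Infrastructure", which describes its design and implementation. Dapper can be regarded as the ancestor of the distributed tracing field. Open-source implementations inspired by it followed, such as Zipkin, Sourcegraph's Appdash, Red Hat's Hawkular APM, and Uber's Jaeger. These tracing solutions were mutually incompatible, however, which is what led to OpenTracing. OpenTracing is a library that defines a common interface for reporting trace data and asks every distributed tracing system to implement it. An application then only needs to integrate with OpenTracing and does not need to care which tracing system runs in the backend, so developers can switch tracing systems seamlessly, and it becomes feasible to add tracing support to shared code libraries.
Most mainstream distributed tracing implementations now support OpenTracing, including Jaeger, Zipkin, and Appdash; see the official document "Supported Tracer Implementations" for details.
The OpenTracing specification explains the data model very clearly. Only the key parts are summarized below; for details, see the original document, "The OpenTracing Semantic Specification".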
    Causal relationships between Spans in a single Trace

            [Span A]  ←←←(the root span)
                |
         +------+------+
         |             |
     [Span B]      [Span C] ←←←(Span C is a `ChildOf` Span A)
         |             |
     [Span D]      +---+-------+
                   |           |
               [Span E]    [Span F] >>> [Span G] >>> [Span H]
                                           ↑
                                           ↑
                                           ↑
                             (Span G `FollowsFrom` Span F)
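A Trace is a call chain, and each call chain consists of multiple Spans. The word "span" literally means a range and can be understood as one processing stage. The relationship between two Spans is called a Reference. The diagram above contains 8 stages, labeled A through H.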
    Temporal relationships between Spans in a single Trace

    ––|–––––––|–––––––|–––––––|–––––––|–––––––|–––––––|–––––––|–> time

     [Span A···················································]
       [Span B··············································]
          [Span D··········································]
        [Span C········································]
             [Span E·······]        [Span F··] [Span G··] [Span H··]
The diagram above shows the same call chain in time order.
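Each stage (Span) encapsulates the following state: an operation name, a start timestamp, a finish timestamp, a set of Tags, a set of Logs, a SpanContext, and References to zero or more causally related Spans.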
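A stage (Span) can have two kinds of reference: ChildOf and FollowsFrom. ChildOf expresses a parent-child relationship, meaning that one stage takes place within another. It is the most common relationship; typical cases are an RPC call, a SQL statement, or a data write. FollowsFrom expresses a follow-on relationship, meaning that one stage takes place after another; it describes sequential execution.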
ChildOf relationship means that the rootSpan has a logical dependency on the child span before rootSpan can complete its operation. Another standard reference type in OpenTracing is FollowsFrom, which means the rootSpan is the ancestor in the DAG, but it does not depend on the completion of the child span, for example if the child represents a best-effort, fire-and-forget cache write.
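A trace represents the data or execution path through a system that is potentially distributed and potentially concurrent. A trace can be thought of as a directed acyclic graph (DAG) of spans.
A span represents a logical unit of work in the system that has a start time and a duration. Spans are nested or ordered sequentially to model causal relationships.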
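To make the trace-as-graph idea concrete, here is a minimal sketch in Go. The `Span` struct, `Root`, and `Depth` are hypothetical names invented for this illustration and are not the opentracing-go types; the sketch also assumes each span has at most one parent, which is a simplification of the general DAG.

```go
package main

import (
	"fmt"
	"time"
)

// RefType names the two reference kinds described in the text.
type RefType string

const (
	ChildOf     RefType = "ChildOf"
	FollowsFrom RefType = "FollowsFrom"
)

// Span is a toy model for illustration, not the opentracing-go type.
type Span struct {
	Name     string
	Start    time.Time
	Duration time.Duration
	Ref      RefType // relation to Parent; empty for the root span
	Parent   *Span   // nil for the root span
}

// Root follows parent references until it reaches the span that has none.
func Root(s *Span) *Span {
	for s.Parent != nil {
		s = s.Parent
	}
	return s
}

// Depth counts the edges between a span and the root of its trace.
func Depth(s *Span) int {
	depth := 0
	for ; s.Parent != nil; s = s.Parent {
		depth++
	}
	return depth
}

func main() {
	t0 := time.Now()
	// Rebuild the A -> C -> F -> G path from the diagram above.
	a := &Span{Name: "A", Start: t0, Duration: 60 * time.Millisecond}
	c := &Span{Name: "C", Start: t0.Add(5 * time.Millisecond), Duration: 50 * time.Millisecond, Ref: ChildOf, Parent: a}
	f := &Span{Name: "F", Start: t0.Add(20 * time.Millisecond), Duration: 10 * time.Millisecond, Ref: ChildOf, Parent: c}
	g := &Span{Name: "G", Start: t0.Add(31 * time.Millisecond), Duration: 10 * time.Millisecond, Ref: FollowsFrom, Parent: f}

	fmt.Println(Root(g).Name, Depth(g), g.Ref)
}
```

Walking the parent pointers from any span reaches the root span, which is how a UI can group spans into one trace even when they arrive out of order.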
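Every span has an operation name: a short, human-readable string (for example, an RPC method name, a function name, or the name of a subtask or stage within a larger computation). The operation name should be an abstract, general identifier that names a statistically meaningful class of spans; use Tags to describe more specific subtypes.
For example, a span that fetches account information could have any of the following names: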
| Operation Name | Guidance |
|:---------------|:--------|
| get | Too general |
| get_account/792 | Too specific |
| get_account | Good, and account_id=792 would make a nice Span tag |
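The guidance in the table can be sketched as a small helper that keeps the operation name general and moves the specific part into a tag. `SplitOperation` and the `name/id` layout are assumptions made for this example only.

```go
package main

import (
	"fmt"
	"strings"
)

// SplitOperation separates a raw name such as "get_account/792" into a
// general operation name and an identifier. The "name/id" layout is an
// assumption of this sketch, not something the specification prescribes.
func SplitOperation(raw string) (operation, id string) {
	if i := strings.Index(raw, "/"); i >= 0 {
		return raw[:i], raw[i+1:]
	}
	return raw, ""
}

func main() {
	operation, id := SplitOperation("get_account/792")

	// The identifier becomes a tag instead of part of the operation name.
	tags := map[string]string{}
	if id != "" {
		tags["account_id"] = id
	}
	fmt.Println(operation, tags)
}
```

Keeping the identifier out of the operation name means all account lookups aggregate under one name, while the tag still lets you find the request for a single account.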
A Span may reference zero or more other SpanContexts that are causally related. OpenTracing presently defines two types of references: ChildOf and FollowsFrom. Both reference types specifically model direct causal relationships between a child Span and a parent Span. In the future, OpenTracing may also support reference types for Spans with non-causal relationships (e.g., Spans that are batched together, Spans that are stuck in the same queue, etc).
ChildOf references: A Span may be the ChildOf a parent Span. In a ChildOf reference, the parent Span depends on the child Span in some capacity. All of the following would constitute ChildOf relationships:
- A Span representing the server side of an RPC may be the ChildOf a Span representing the client side of that RPC
- A Span representing a SQL insert may be the ChildOf a Span representing an ORM save method
- Many Spans doing concurrent (perhaps distributed) work may all individually be the ChildOf a single parent Span that merges the results for all children that return within a deadline
These could all be valid timing diagrams for children that are the ChildOf a parent.
    [-Parent Span---------]
         [-Child Span----]

    [-Parent Span--------------]
         [-Child Span A----]
          [-Child Span B----]
           [-Child Span C----]
            [-Child Span D---------------]
            [-Child Span E----]
FollowsFrom references: Some parent Spans do not depend in any way on the result of their child Spans. In these cases, we say merely that the child Span FollowsFrom the parent Span in a causal sense. There are many distinct FollowsFrom reference sub-categories, and in future versions of OpenTracing they may be distinguished more formally.
These can all be valid timing diagrams for children that "FollowFrom" a parent.
    [-Parent Span-]  [-Child Span-]

    [-Parent Span--]
     [-Child Span-]

    [-Parent Span-]
                [-Child Span-]
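Each span can perform multiple Logs operations. Each log requires a timestamped event name and may carry an optional structured payload of arbitrary size.
The standard documents some common logging use cases and the associated log keys; see the Data Conventions Guidelines.
Each span can carry multiple Tags in key:value form. Tags have no timestamp; they simply annotate and supplement the span.
As with Logs, a tracer may pay special attention to Tags whose keys are known for application-specific scenarios. For more information, see the Data Conventions Guidelines.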
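Every span must provide access to its SpanContext. The SpanContext represents the state that crosses process boundaries and is passed to descendant spans (for example, a <trace_id, span_id, sampled> tuple), and it also encapsulates Baggage (explained below). The SpanContext is used when crossing a process boundary and when creating an edge in the trace graph (a ChildOf relationship or another kind; see the relationships between Spans).
Baggage is a set of key:value pairs stored in the SpanContext. It is propagated to every span in a trace, together with those spans' SpanContexts. Because it travels with the trace, it is called "Baggage" (think of luggage carried along as the trace proceeds). In a fully instrumented OpenTracing stack, Baggage enables powerful functionality by transparently carrying arbitrary application data. For example, a Baggage item can be added on the end user's mobile device and propagated through the distributed tracing system down to the storage layer; the call stack can then be reconstructed in reverse to locate the expensive SQL queries.
Baggage is powerful, but it is also costly. Because Baggage is propagated everywhere, too much data or too many items will reduce system throughput or increase RPC latency.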
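A rough way to see the cost is to count the bytes that Baggage adds to each outgoing request. The sketch below is only an illustration: the `X-Baggage-` header prefix and the `InjectBaggage` helper are hypothetical, and real tracers define their own wire format.

```go
package main

import (
	"fmt"
	"net/http"
)

// baggagePrefix is a hypothetical header prefix chosen for this sketch.
const baggagePrefix = "X-Baggage-"

// InjectBaggage copies every baggage item into the outgoing headers and
// returns the number of bytes (header names plus values) this adds.
func InjectBaggage(baggage map[string]string, h http.Header) int {
	added := 0
	for key, value := range baggage {
		name := baggagePrefix + key
		h.Set(name, value)
		added += len(name) + len(value)
	}
	return added
}

func main() {
	baggage := map[string]string{"device": "ios", "user": "792"}

	// The same items travel on every hop, so the per-request cost is
	// multiplied by the number of downstream calls in the trace.
	hops := 5
	perHop := InjectBaggage(baggage, http.Header{})
	fmt.Println(perHop, perHop*hops)
}
```

Two small items are cheap, but the total grows with both the size of the Baggage and the number of downstream calls, which is why large payloads do not belong in Baggage.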
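A SpanContext can be written into a Carrier through the Inject operation and read back from a Carrier through the Extract operation; the Carrier is the data exchanged between processes (for example, HTTP headers). In this way SpanContexts cross process boundaries and carry enough information to establish relationships between spans in different processes, which is what makes continuous cross-process tracing possible.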
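The Inject and Extract round trip can be sketched with nothing but the standard library. The header names and the `SpanContext` struct below are assumptions made for this example; each real tracer (Jaeger, Zipkin, and so on) defines its own header format.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"strconv"
)

// SpanContext holds the state that crosses process boundaries.
type SpanContext struct {
	TraceID uint64
	SpanID  uint64
	Sampled bool
}

// Header names are hypothetical; real tracers define their own wire format.
const (
	headerTraceID = "X-Trace-Id"
	headerSpanID  = "X-Span-Id"
	headerSampled = "X-Sampled"
)

// Inject writes the context into the carrier (HTTP headers here).
func Inject(sc SpanContext, h http.Header) {
	h.Set(headerTraceID, strconv.FormatUint(sc.TraceID, 16))
	h.Set(headerSpanID, strconv.FormatUint(sc.SpanID, 16))
	h.Set(headerSampled, strconv.FormatBool(sc.Sampled))
}

// Extract rebuilds the context on the receiving side, or returns an error
// when the carrier holds no usable context.
func Extract(h http.Header) (SpanContext, error) {
	if h.Get(headerTraceID) == "" {
		return SpanContext{}, errors.New("carrier holds no span context")
	}
	traceID, err := strconv.ParseUint(h.Get(headerTraceID), 16, 64)
	if err != nil {
		return SpanContext{}, err
	}
	spanID, err := strconv.ParseUint(h.Get(headerSpanID), 16, 64)
	if err != nil {
		return SpanContext{}, err
	}
	sampled, err := strconv.ParseBool(h.Get(headerSampled))
	if err != nil {
		return SpanContext{}, err
	}
	return SpanContext{TraceID: traceID, SpanID: spanID, Sampled: sampled}, nil
}

func main() {
	outgoing := http.Header{}
	Inject(SpanContext{TraceID: 0xabc, SpanID: 0x1, Sampled: true}, outgoing)

	received, err := Extract(outgoing)
	fmt.Println(received, err)

	// A request without tracing headers starts a new trace on the server.
	_, err = Extract(http.Header{})
	fmt.Println(err)
}
```

The error path matters in practice: a server that receives no context simply starts a root span instead of a child span.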
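Each platform's OpenTracing API library (opentracing-go, opentracing-java, and so on) must provide a No-op Tracer. The No-op Tracer implementation must never fail and must have no side effects. That way, if the application has not configured a collector service or storage and has not initialized a global tracer, but its RPC, ORM, or other components already contain tracing probes, the global default is the No-op Tracer instance and the application is unaffected.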
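The No-op Tracer idea can be sketched as follows. The `Tracer` and `Span` interfaces here are deliberately tiny stand-ins written for this example, not the real opentracing-go interfaces, and `instrumentedQuery` is a hypothetical library function that carries a tracing probe.

```go
package main

import "fmt"

// Span and Tracer are reduced stand-ins for the real OpenTracing interfaces.
type Span interface {
	SetTag(key string, value interface{})
	Finish()
}

type Tracer interface {
	StartSpan(operationName string) Span
}

// noopSpan and noopTracer accept every call and do nothing.
type noopSpan struct{}

func (noopSpan) SetTag(string, interface{}) {}
func (noopSpan) Finish()                    {}

type noopTracer struct{}

func (noopTracer) StartSpan(string) Span { return noopSpan{} }

// globalTracer defaults to the no-op tracer, so instrumented libraries are
// safe to call before the application configures real tracing.
var globalTracer Tracer = noopTracer{}

// SetGlobalTracer replaces the default with a real tracer.
func SetGlobalTracer(t Tracer) { globalTracer = t }

// GlobalTracer returns the currently registered tracer.
func GlobalTracer() Tracer { return globalTracer }

// instrumentedQuery stands in for a library (ORM, RPC client) that
// contains tracing probes.
func instrumentedQuery(sql string) string {
	span := GlobalTracer().StartSpan("sql.query")
	defer span.Finish()
	span.SetTag("db.statement", sql)
	return "ok"
}

func main() {
	// No tracer was configured, yet the probe runs without side effects.
	fmt.Println(instrumentedQuery("SELECT 1"))
}
```

Because the default never returns nil, instrumented code needs no nil checks and no "is tracing enabled" branches.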
Jaeger can be deployed either as all-in-one binary, where all Jaeger backend components run in a single process, or as a scalable distributed system, discussed below. There are two main deployment options:
1. Collectors are writing directly to storage.
2. Collectors are writing to Kafka as a preliminary buffer.
Illustration of direct-to-storage architecture
Illustration of architecture with Kafka as intermediate buffer
An instrumented service creates spans when receiving new requests and attaches context information (trace id, span id, and baggage) to outgoing requests. Only ids and baggage are propagated with requests; all other information that compose a span like operation name, logs, etc. are not propagated. Instead sampled spans are transmitted out of process asynchronously, in the background, to Jaeger Agents.
The instrumentation has very little overhead, and is designed to be always enabled in production.
Note that while all traces are generated, only a few are sampled. Sampling a trace marks the trace for further processing and storage. By default, Jaeger client samples 0.1% of traces (1 in 1000), and has the ability to retrieve sampling strategies from the agent.
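One common way to implement a fixed sampling rate is to compare the trace ID against a boundary derived from the rate, so that every service makes the same decision for the same trace. The sketch below illustrates that technique; `Sampler` and `NewSampler` are names chosen for this example and are not taken from the Jaeger client source.

```go
package main

import (
	"fmt"
	"math"
)

// Sampler keeps a trace when its ID falls below a boundary derived from
// the sampling rate.
type Sampler struct {
	boundary uint64
}

// NewSampler builds a sampler for a rate between 0 (never) and 1 (always).
func NewSampler(rate float64) Sampler {
	if rate <= 0 {
		return Sampler{boundary: 0}
	}
	if rate >= 1 {
		return Sampler{boundary: math.MaxUint64}
	}
	return Sampler{boundary: uint64(rate * float64(math.MaxUint64))}
}

// IsSampled decides from the trace ID alone, so the result is stable for
// the whole trace as long as IDs are uniformly random.
func (s Sampler) IsSampled(traceID uint64) bool {
	return s.boundary == math.MaxUint64 || traceID < s.boundary
}

func main() {
	s := NewSampler(0.001) // the 1-in-1000 default quoted above
	fmt.Println(s.IsSampled(42), s.IsSampled(math.MaxUint64/2))
}
```

Deciding from the trace ID rather than from a fresh random number keeps a trace either fully sampled or fully dropped, with no half-recorded traces.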
Agent
The Jaeger agent is a network daemon that listens for spans sent over UDP, which it batches and sends to the collector. It is designed to be deployed to all hosts as an infrastructure component. The agent abstracts the routing and discovery of the collectors away from the client.
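The batching behavior can be sketched independently of the network layer. In this toy version spans are plain strings and `send` is a callback; the `Batcher` type is hypothetical and only illustrates the buffer-then-forward idea, not the agent's actual UDP or gRPC code.

```go
package main

import "fmt"

// Batcher accumulates spans and hands them to send in groups of at most
// size elements.
type Batcher struct {
	size    int
	pending []string
	send    func(batch []string)
}

// NewBatcher creates a batcher that forwards full batches through send.
func NewBatcher(size int, send func(batch []string)) *Batcher {
	return &Batcher{size: size, send: send}
}

// Add buffers one span and flushes as soon as the batch is full.
func (b *Batcher) Add(span string) {
	b.pending = append(b.pending, span)
	if len(b.pending) >= b.size {
		b.Flush()
	}
}

// Flush forwards whatever is buffered, if anything.
func (b *Batcher) Flush() {
	if len(b.pending) == 0 {
		return
	}
	b.send(b.pending)
	b.pending = nil // start a fresh slice so the sent batch is not reused
}

func main() {
	var batches [][]string
	b := NewBatcher(3, func(batch []string) { batches = append(batches, batch) })

	for i := 0; i < 7; i++ {
		b.Add(fmt.Sprintf("span-%d", i))
	}
	b.Flush() // forward the final partial batch

	for _, batch := range batches {
		fmt.Println(len(batch), batch)
	}
}
```

A production agent would also flush on a timer so that a quiet service does not hold spans indefinitely; that part is left out here.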
Collector
The Jaeger collector receives traces from Jaeger agents and runs them through a processing pipeline. Currently our pipeline validates traces, indexes them, performs any transformations, and finally stores them.
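A processing pipeline of this shape can be sketched in a few lines. Everything here is a simplified assumption: `Store` is an in-memory stand-in for the storage backend, the validation and transformation rules are invented for the example, and indexing and storing are collapsed into a single map write.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Span carries only the fields this sketch needs.
type Span struct {
	TraceID   string
	Operation string
}

// Store is an in-memory stand-in for a storage backend. The map doubles
// as the index from trace ID to spans.
type Store struct {
	byTrace map[string][]Span
}

// NewStore returns an empty store.
func NewStore() *Store { return &Store{byTrace: make(map[string][]Span)} }

// validate rejects spans that cannot be attached to a trace.
func validate(s Span) error {
	if s.TraceID == "" {
		return errors.New("span has no trace ID")
	}
	return nil
}

// transform normalizes the operation name (an invented rule).
func transform(s Span) Span {
	s.Operation = strings.ToLower(strings.TrimSpace(s.Operation))
	return s
}

// Process runs one span through validate, transform, and index/store.
func (st *Store) Process(s Span) error {
	if err := validate(s); err != nil {
		return err
	}
	s = transform(s)
	st.byTrace[s.TraceID] = append(st.byTrace[s.TraceID], s)
	return nil
}

func main() {
	st := NewStore()
	fmt.Println(st.Process(Span{TraceID: "t1", Operation: "  Get_Account "}))
	fmt.Println(st.Process(Span{Operation: "orphan"}))
	fmt.Println(st.byTrace["t1"][0].Operation)
}
```

Rejecting invalid spans before anything is written keeps the index consistent, which is the main reason validation sits at the front of the pipeline.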
Jaeger’s storage is a pluggable component which currently supports Cassandra, Elasticsearch and Kafka.
Query
Query is a service that retrieves traces from storage and hosts a UI to display them.
Ingester
Ingester is a service that reads from Kafka topic and writes to another storage backend (Cassandra, Elasticsearch).
Jaeger client libraries expect jaeger-agent process to run locally on each host.
It can be executed directly on the host or via Docker, as follows:
    ## make sure to expose only the ports you use in your deployment scenario!
    docker run \
      --rm \
      -p6831:6831/udp \
      -p6832:6832/udp \
      -p5778:5778/tcp \
      -p5775:5775/udp \
      jaegertracing/jaeger-agent:1.12
The agents can connect point to point to a single collector address, which could be load balanced by another infrastructure component (e.g. DNS) across multiple collectors. The agent can also be configured with a static list of collector addresses.
On Docker, a command like the following can be used:
    docker run \
      --rm \
      -p5775:5775/udp \
      -p6831:6831/udp \
      -p6832:6832/udp \
      -p5778:5778/tcp \
      jaegertracing/jaeger-agent:1.12 \
      --reporter.grpc.host-port=jaeger-collector.jaeger-infra.svc:14250
When using gRPC, you have several options for load balancing and name resolution:
The collectors are stateless and thus many instances of jaeger-collector can be run in parallel. Collectors require almost no configuration, except for the location of Cassandra cluster, via --cassandra.keyspace and --cassandra.servers options, or the location of Elasticsearch cluster, via --es.server-urls, depending on which storage is specified. To see all command line options run
go run ./cmd/collector/main.go -h
or, if you don’t have the source code
docker run -it --rm jaegertracing/jaeger-collector:1.12 -h
Collectors require a persistent storage backend. Cassandra and Elasticsearch are the primary supported storage backends.
The storage type can be passed via SPAN_STORAGE_TYPE environment variable. Valid values are cassandra, elasticsearch, kafka (only as a buffer), grpc-plugin, badger (only with all-in-one) and memory (only with all-in-one).
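The rules in that sentence can be captured in a small check, which is handy in a deployment script or a startup wrapper. The `CheckStorageType` helper is hypothetical and encodes only the values and restrictions listed above; it is not part of Jaeger.

```go
package main

import (
	"fmt"
	"os"
)

// validStorageTypes lists the values named in the text. The boolean marks
// values that are only available with the all-in-one binary.
var validStorageTypes = map[string]bool{
	"cassandra":     false,
	"elasticsearch": false,
	"kafka":         false, // only as a buffer
	"grpc-plugin":   false,
	"badger":        true,
	"memory":        true,
}

// CheckStorageType reports whether value is acceptable for the given
// deployment mode.
func CheckStorageType(value string, allInOne bool) error {
	onlyAllInOne, known := validStorageTypes[value]
	if !known {
		return fmt.Errorf("unknown SPAN_STORAGE_TYPE %q", value)
	}
	if onlyAllInOne && !allInOne {
		return fmt.Errorf("%q is only available with all-in-one", value)
	}
	return nil
}

func main() {
	value := os.Getenv("SPAN_STORAGE_TYPE")
	if value == "" {
		fmt.Println("SPAN_STORAGE_TYPE is not set")
		return
	}
	if err := CheckStorageType(value, false); err != nil {
		fmt.Println("invalid configuration:", err)
		return
	}
	fmt.Println("storage type accepted:", value)
}
```

Failing fast on a mistyped value is cheaper than discovering at runtime that the collector started with a backend you did not intend.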
Elasticsearch
Supported in Jaeger since 0.6.0. Supported versions: 5.x, 6.x.
Elasticsearch does not require initialization other than installing and running Elasticsearch. Once it is running, pass the correct configuration values to the Jaeger collector and query service.
Configuration
Minimal
    docker run \
      -e SPAN_STORAGE_TYPE=elasticsearch \
      -e ES_SERVER_URLS=<...> \
      jaegertracing/jaeger-collector:1.12
To view the full list of configuration options, you can run the following command:
    docker run \
      -e SPAN_STORAGE_TYPE=elasticsearch \
      jaegertracing/jaeger-collector:1.12 \
      --help
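The microservice framework has two parts, HTTP (gin) and gRPC: it exposes a REST API externally and serves gRPC internally.
Microservice software framework diagram:
The following is the general flow for integrating OpenTracing into the microservice framework.
Create a tracer when the HTTP service starts and register it as the global tracer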
    tracer, closer := tracing.Init("hello-world")
    defer closer.Close()
    opentracing.SetGlobalTracer(tracer)
Create a Span. If the HTTP headers carry trace and span information, extract it from the headers; otherwise start a new trace.
    spanCtx, _ := tracer.Extract(opentracing.HTTPHeaders, opentracing.HTTPHeadersCarrier(r.Header))
    span := tracer.StartSpan("format", ext.RPCServerOption(spanCtx))
    defer span.Finish()

    // Store the span in the context. Calls between functions inside the
    // process must pass ctx along, because spans are linked through ctx.
    ctx := opentracing.ContextWithSpan(context.Background(), span)
Inside the HTTP/gRPC service functions (within the process)
    span, _ := opentracing.StartSpanFromContext(ctx, "formatString")
    defer span.Finish()

    // For a cross-process call, such as calling a REST API, the span
    // information must be injected into the HTTP headers:
    // tracing.InjectToHeaders(ctx, "GET", url, req.Header)
    func InjectToHeaders(ctx context.Context, method string, url string, header http.Header) {
    	span := opentracing.SpanFromContext(ctx)
    	if span != nil {
    		ext.SpanKindRPCClient.Set(span)
    		ext.HTTPUrl.Set(span, url)
    		ext.HTTPMethod.Set(span, method)
    		span.Tracer().Inject(
    			span.Context(),
    			opentracing.HTTPHeaders,
    			opentracing.HTTPHeadersCarrier(header),
    		)
    	}
    }

    span.LogFields(
    	log.String("event", "string-format"),
    	log.String("value", helloStr),
    )
Router instrumentation
For every HTTP route whose requests should be traced, add the "tracing.NewSpan" function to the route.
    import ".../go_common/tracing"
    ...
    authorized := r.Group("/v1")
    authorized.Use(handlers.TokenCheck, handlers.MustLogin())
    {
    	authorized.GET("/user/:id", handlers.GetUserInfo)
    	authorized.GET("/user", handlers.GetUserInfoByToken)
    	authorized.PUT("/user/:id", tracing.NewSpan("put /user/:id", "handlers.Setting", false), handlers.Setting)
    }
Parameters
NewSpan(service string, operationName string, abortOnErrors bool, opts ...opentracing.StartSpanOption)
service is generally filled with the API endpoint.
operationName can be filled with the HandleFunc's name.
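The source does not show how `tracing.NewSpan` and `tracing.GetSpan` are implemented, so the following is only a sketch of the general pattern, written against the standard library's net/http instead of gin. `WithSpan`, `GetSpan`, `spanInfo`, the `X-Trace-Id` header, and the `new-trace` placeholder are all hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// spanInfo is a toy stand-in for a real span.
type spanInfo struct {
	Service, Operation, TraceID string
}

// ctxKey is the private context key under which the span is stored.
type ctxKey struct{}

// traceHeader is a hypothetical header name used only in this sketch.
const traceHeader = "X-Trace-Id"

// WithSpan wraps a handler: it continues the incoming trace when the
// header is present, starts a new one otherwise, and stores the span in
// the request context for the handler to pick up.
func WithSpan(service, operation string, next http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		traceID := r.Header.Get(traceHeader)
		if traceID == "" {
			traceID = "new-trace" // a real tracer would generate a random ID
		}
		sp := spanInfo{Service: service, Operation: operation, TraceID: traceID}
		ctx := context.WithValue(r.Context(), ctxKey{}, sp)
		next(w, r.WithContext(ctx))
	}
}

// GetSpan plays the role that tracing.GetSpan plays in the text.
func GetSpan(ctx context.Context) (spanInfo, bool) {
	sp, ok := ctx.Value(ctxKey{}).(spanInfo)
	return sp, ok
}

// serve sends one in-memory request through the wrapped handler and
// returns the trace ID that the handler observed.
func serve(incomingTraceID string) string {
	var seen string
	handler := WithSpan("put /user/:id", "handlers.Setting", func(w http.ResponseWriter, r *http.Request) {
		if sp, ok := GetSpan(r.Context()); ok {
			seen = sp.TraceID
		}
	})

	req := httptest.NewRequest(http.MethodPut, "/v1/user/792", nil)
	if incomingTraceID != "" {
		req.Header.Set(traceHeader, incomingTraceID)
	}
	handler(httptest.NewRecorder(), req)
	return seen
}

func main() {
	fmt.Println(serve("abc123"), serve(""))
}
```

Putting the span into the request context in the middleware is what lets the handler retrieve it later without the route definition knowing anything about the handler's internals.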
Handler function instrumentation
    func Setting(c *gin.Context) {
    	...
    	// Get the span from the gin context; this instrumentation is required.
    	span, found := tracing.GetSpan(c)

    	// Add a tag and a log.
    	if found && span != nil {
    		span.SetTag("req", req)
    		span.LogFields(
    			log.Object("uid", uid),
    		)
    	}

    	// opentracing.ContextWithSpan binds the span to the context; this
    	// step is also required in handler functions.
    	ctx, cancel := context.WithTimeout(opentracing.ContextWithSpan(context.Background(), span), time.Second*3)
    	defer cancel()

    	// Call via gRPC; no special handling is needed here.
    	auth := passportpb.Authentication{
    		LoginToken: c.GetHeader("Qsc-Peduli-Token"),
    	}
    	cli, _ := passportpb.Dial(ctx, grpc.WithPerRPCCredentials(&auth))
    	reply, err := cli.Setting(ctx, req)

    	// Alternatively, call the local RPC method directly; no special
    	// handling is needed here either.
    	ctx = metadata.AppendToOutgoingContext(ctx, "logintoken", c.GetHeader("Qsc-Peduli-Token"))
    	reply, err = rpc.Srv.Setting(ctx, req)
    	...
    }
gRPC client SDK
    import "github.com/grpc-ecosystem/go-grpc-middleware/tracing/opentracing"
    ...
    // Dial grpc server
    func (c *Client) Dial(serviceName string, opts ...grpc.DialOption) (*grpc.ClientConn, error) {
    	...
    	unaryInterceptor := grpc_middleware.ChainUnaryClient(
    		grpc_opentracing.UnaryClientInterceptor(),
    	)
    	c.Dialopts = append(c.Dialopts, grpc.WithUnaryInterceptor(unaryInterceptor))
    	conn, err := grpc.Dial(serviceName, c.Dialopts...)
    	if err != nil {
    		return nil, fmt.Errorf("Failed to dial %s: %v", serviceName, err)
    	}
    	return conn, nil
    }
gRPC server SDK
    import "github.com/grpc-ecosystem/go-grpc-middleware/tracing/opentracing"
    ...
    func NewServer(serviceName, addr string) *Server {
    	var opts []grpc.ServerOption
    	opts = append(opts, grpc_middleware.WithUnaryServerChain(
    		grpc_opentracing.UnaryServerInterceptor(),
    	))
    	srv := grpc.NewServer(opts...)
    	return &Server{
    		serviceName: serviceName,
    		addr:        addr,
    		grpcServer:  srv,
    	}
    }
RPC function instrumentation
    func (s *Service) Setting(ctx context.Context, req *passportpb.UserSettingRequest) (*passportpb.UserSettingReply, error) {
    	// If this is not a gRPC call, that is, the function was invoked as a
    	// local RPC method, start the span from the one carried in the context.
    	if !s.meta.IsGrpcRequest(ctx) {
    		span, _ := opentracing.StartSpanFromContext(ctx, "rpc.srv.Setting")
    		defer span.Finish()
    	}

    	// If the RPC function calls other gRPC functions, call them as usual.
    	// The gRPC request context already carries the trace and span
    	// information, so it propagates without any extra work.
    	reqVerify := new(passportpb.VerifyRequest)
    	reqVerify.UID = req.UserID
    	cli, _ := passportpb.Dial(ctx)
    	replyV, _ := cli.Verify(ctx, reqVerify)
    }
Final result in the Jaeger UI