Log system architecture

0. Technology selection reference

Architecture

1. Collector

Keywords: Collector, Processor

Name | Beats | Fluent Bit
Introduction | Beats are lightweight (resource-efficient, dependency-free, small), open source log shippers that collect and process data. They run as agents installed on the servers in your infrastructure to collect logs or metrics. | Fluent Bit was born to address the need for a high-performance, optimized tool that can collect and process data from any input source, unify that data, and deliver it to multiple destinations.
Owner | Elastic | Treasure Data
Open Source | True | True
GitHub Stars | 5742 | 608
License | Apache License v2.0 | Apache License v2.0
Scope | Containers / Servers / K8S | Containers / Servers / K8S
Language | Go | C
Memory | ~10MB | ~500KB
Performance | High | High
Dependencies | Zero dependencies, unless some special plugin requires them. | Zero dependencies, unless some special plugin requires them.
Category | Auditbeat, Filebeat, Heartbeat, Metricbeat, Packetbeat, Winlogbeat | N/A
Configuration | File (.yml) / Cmd | File (custom file extension and syntax) / Cmd
Essence | Collector & Processor | Collector & Processor
Input/Module | File, Docker, Syslog, Nginx, MySQL, PostgreSQL, etc. | File, CPU, Disk, Docker, Syslog, etc.
Output | Elasticsearch, Logstash, Kafka, Redis, File, Console | ES, File, Kafka, etc.

1.1 Filebeat architecture diagram

Official Filebeat

Original Filebeat

Evolved Filebeat

  1. Ingest Node - An Elasticsearch node role that pre-processes documents before the actual indexing happens and can replace Logstash. The ingest node intercepts bulk and index requests, applies transformations, and then passes the documents back to the index or bulk APIs. Define a pipeline that specifies a series of processors, then register the pipeline ID in the Filebeat configuration file.
  2. Kafka - Prevents data loss and regulates the log output rate.
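
The ingest pipeline idea in item 1 can be illustrated with a small in-memory model. This is only a sketch of the concept (an ordered series of processors applied to a document before it is indexed), not Elasticsearch's API: the processor helpers, the `PIPELINES` registry, the pipeline ID `app-logs`, and the list used as an index are all hypothetical.

```python
from typing import Callable, Dict, List

Document = Dict[str, object]
Processor = Callable[[Document], Document]


def rename_field(old: str, new: str) -> Processor:
    """Return a processor that renames one field if it is present."""
    def apply(doc: Document) -> Document:
        out = dict(doc)  # copy so the caller's document is not mutated
        if old in out:
            out[new] = out.pop(old)
        return out
    return apply


def lowercase_field(name: str) -> Processor:
    """Return a processor that lowercases one string field."""
    def apply(doc: Document) -> Document:
        out = dict(doc)
        value = out.get(name)
        if isinstance(value, str):
            out[name] = value.lower()
        return out
    return apply


# Hypothetical registry: pipeline ID -> ordered list of processors.
PIPELINES: Dict[str, List[Processor]] = {
    "app-logs": [rename_field("msg", "message"), lowercase_field("level")],
}


def index_with_pipeline(doc: Document, pipeline_id: str,
                        index: List[Document]) -> Document:
    """Run the document through the named pipeline, then 'index' it."""
    for processor in PIPELINES[pipeline_id]:
        doc = processor(doc)
    index.append(doc)  # stand-in for the real indexing step
    return doc
```

In this model the shipper only needs to know the pipeline ID; the transformations themselves live on the storage side, which is what lets the ingest node take over work otherwise done by Logstash.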

1.2 Fluent Bit architecture diagram

Logging pipeline

Name | Description | Samples
Input | Entry point of data. Implemented through input plugins, this interface allows data to be gathered or received. | Samples
Parser | Parsers convert unstructured data gathered from the Input interface into structured data. Parsers are optional and depend on the input plugins. | Prospector and processors in Filebeat
Filter | The filtering mechanism alters the data ingested by the input plugins. Filters are implemented as plugins. | Prospector and processors in Filebeat
Buffer | By default, the data ingested by the input plugins resides in memory until it is routed and delivered to an Output interface. |
Routing | Data ingested by an Input interface is tagged: a Tag is assigned to it and is used to determine where the data should be routed, based on a match rule. |
Output | An output defines a destination for the data. Destinations are handled by output plugins. Thanks to the Routing interface, the data can be delivered to multiple destinations. | Samples
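
The stages in the table above can be sketched as a toy pipeline. This is a minimal model under stated assumptions, not Fluent Bit's implementation: the `Pipeline` class, the `parse_json` and `drop_debug` helpers, and the use of `fnmatch` wildcards to approximate match rules are all choices made for illustration.

```python
import fnmatch
import json
from collections import deque
from typing import Callable, Deque, Dict, List, Optional, Tuple

Record = Dict[str, object]
Sink = Callable[[str, Record], None]


def parse_json(raw: str) -> Record:
    """Parser stage: turn an unstructured line into a structured record."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return {"log": raw}  # keep unparseable lines as plain text
    return parsed if isinstance(parsed, dict) else {"log": raw}


def drop_debug(record: Record) -> Optional[Record]:
    """Filter stage: returning None drops the record."""
    return None if record.get("level") == "debug" else record


class Pipeline:
    def __init__(self) -> None:
        # Buffer stage: records wait in memory until they are flushed.
        self.buffer: Deque[Tuple[str, Record]] = deque()
        # Output stage: (match rule, sink) pairs.
        self.outputs: List[Tuple[str, Sink]] = []

    def add_output(self, match: str, sink: Sink) -> None:
        self.outputs.append((match, sink))

    def ingest(self, tag: str, raw: str) -> None:
        """Input stage: receive a raw line together with its tag."""
        record = drop_debug(parse_json(raw))
        if record is not None:
            self.buffer.append((tag, record))

    def flush(self) -> None:
        """Routing stage: deliver each record to every output whose
        match rule fits the record's tag."""
        while self.buffer:
            tag, record = self.buffer.popleft()
            for match, sink in self.outputs:
                if fnmatch.fnmatchcase(tag, match):
                    sink(tag, record)
```

Registering one output with the rule `app.*` and another with `*` shows the multi-destination behaviour: a record tagged `app.web` reaches both sinks, while one tagged `sys.cpu` reaches only the second.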

2. Log Transporter

Keywords: Collector, Processor, Aggregator

Name | Logstash | Fluentd
Introduction | Logstash is an open source, server-side data processing pipeline that ingests data from a multitude of sources simultaneously, transforms it, and then sends it to your stash. | Fluentd is an open source data collector, which lets you unify the data.
Owner | Elastic | Treasure Data
Open Source | True | True
GitHub Stars | 9105 | 6489
License | Apache License v2.0 | Apache License v2.0
Scope | Containers / Servers / K8S | Containers / Servers / K8S
Language | JRuby (JVM) | Ruby & C
Memory | 200MB+ | ~40MB
Performance | Medium | High
Dependencies | JVM | Ruby Gem
Configuration | File (custom file extension and syntax) / Cmd | File (custom file extension and syntax) / Cmd
Essence | Collector, Processor, Aggregator | Collector, Processor, Aggregator
Input/Module | Limited only by your imagination (Serilog) | Limited only by your imagination (NLog)
Output | Limited only by your imagination | Limited only by your imagination

Further Reading: Fluentd vs. Logstash: A Comparison of Log Collectors

3. Preliminary summary

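Comparison | Beats + Logstash | Fluent Bit + Fluentd | Notes
Feature coverage | | | Essentially the same
Ease of installation and configuration | | |
Memory footprint | | | Due to the nature of the JVM
Reliability | | | The former relies on a registry file plus Redis for reliability; the latter relies on built-in buffering
Extensibility | | | Plugin ecosystems and extensibility are essentially the same. The latter uses decentralized plugin management
Trend | | | ELK -> EFK
Other | | | The former leans toward the Go and Java technology stack; the latter has official Docker and K8S log driver types and case studies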

Tips: Any layer can be freely replaced.
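
The registry-file approach to reliability mentioned in the comparison can be sketched as follows: the shipper records how far it has read into each log file, so a restart resumes from the saved offset instead of resending or skipping lines. This is an illustrative model only; the JSON layout, the function names, and the `send` callback are assumptions and do not reflect Filebeat's actual registry format.

```python
import json
import os
from typing import Callable, Dict


def load_registry(path: str) -> Dict[str, int]:
    """Read the saved offsets, or start empty if no registry exists yet."""
    if not os.path.exists(path):
        return {}
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)


def save_registry(path: str, registry: Dict[str, int]) -> None:
    """Write to a temp file and rename, so a crash never leaves a
    half-written registry behind."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump(registry, fh)
    os.replace(tmp, path)


def ship_new_lines(log_path: str, registry_path: str,
                   send: Callable[[str], None]) -> int:
    """Send every complete line added since the last run; return the count."""
    registry = load_registry(registry_path)
    shipped = 0
    with open(log_path, "rb") as fh:
        fh.seek(registry.get(log_path, 0))
        while True:
            line = fh.readline()
            if not line.endswith(b"\n"):
                break  # end of file or a partial line: retry on the next run
            send(line.rstrip(b"\r\n").decode("utf-8", errors="replace"))
            # Persist the offset only after the line was sent, so a failure
            # in send() causes a resend rather than a lost line.
            registry[log_path] = fh.tell()
            save_registry(registry_path, registry)
            shipped += 1
    return shipped
```

Saving after every line keeps the sketch simple at the cost of extra disk writes; a real shipper would typically batch these updates.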

4. Visualizer

Keywords: Query, Analyze, Monitor

Name | Kibana | Grafana
Introduction | Kibana is an open source data visualization plugin for Elasticsearch. | Data visualization and monitoring with support for Graphite, InfluxDB, Prometheus, Elasticsearch, and many more databases. The leading open source software for time series analytics.
Owner | Elastic | Grafana
Open Source | True | True
GitHub Stars | 9k+ | 22k+
License | Apache License v2.0 | Apache License v2.0
Scope | Elasticsearch only | Elasticsearch, InfluxDB, PostgreSQL, etc.
Language | JavaScript | Go & TypeScript
Configuration | File (.yml) / Cmd | File (custom file extension and syntax) / Cmd
Simple Query | Lucene syntax and filter components | Filter components; the syntax differs for each data source
Full-Text Query | Yes | No
Security | Plugins or libraries | Integration
Notification | Plugins or libraries | Integration
Advantages | Log, ES | Multiple data sources, APM, time series

Working together.

5. Log Storage and Analyzer

Keywords: Storage, ES, PostgreSQL, ZomboDB, ArangoDB

5.1 Elasticsearch

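Advantages:

  1. Supports object search on single documents, fuzzy search, and full-text search at the same time
  2. An officially supported storage backend for SkyWalking
  3. As a popular output, it is supported by the vast majority of log-related systems
  4. Distributed by design
  5. One-step configuration of expiration windows and index rebuilding
  6. …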

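Disadvantages:

  1. Consumes considerable resources and places high demands on storage media
  2. Higher operations and maintenance cost
  3. Persistence
  4. Security - Search Guard
  5. …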

6. Summary

  1. Sinks (log sinks, Beats, Fluent Bit) -> Storages (Elasticsearch, PostgreSQL, ZomboDB, etc.).
  2. Collectors (Beats, Fluent Bit) -> Kafka -> Fluentd -> Storages (Elasticsearch, PostgreSQL, ZomboDB, etc.).
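
The second topology can be modelled with a bounded in-memory queue standing in for Kafka. This is a minimal sketch under stated assumptions: `queue.Queue` is only a stand-in for a broker, and the `collector`, `aggregator`, and `run` functions and the `length` enrichment field are hypothetical. It shows how the middle layer decouples collectors from the aggregator: when the queue is full, collectors block instead of dropping events.

```python
import queue
import threading
from typing import Dict, List

SENTINEL = None  # marks the end of the stream for the aggregator


def collector(name: str, lines: List[str], broker: queue.Queue) -> None:
    """Collector layer: tag each line with its source and publish it."""
    for line in lines:
        broker.put({"source": name, "log": line})  # blocks while broker is full


def aggregator(broker: queue.Queue, storage: List[dict]) -> None:
    """Aggregator layer: consume, enrich, and write to storage."""
    while True:
        event = broker.get()
        if event is SENTINEL:
            break
        event["length"] = len(event["log"])  # example enrichment step
        storage.append(event)  # stand-in for the storage layer


def run(sources: Dict[str, List[str]], capacity: int = 2) -> List[dict]:
    """Wire collectors -> broker -> aggregator -> storage and run to completion."""
    broker: queue.Queue = queue.Queue(maxsize=capacity)
    storage: List[dict] = []
    consumer = threading.Thread(target=aggregator, args=(broker, storage))
    consumer.start()
    producers = [
        threading.Thread(target=collector, args=(name, lines, broker))
        for name, lines in sources.items()
    ]
    for thread in producers:
        thread.start()
    for thread in producers:
        thread.join()
    broker.put(SENTINEL)  # all collectors are done; stop the aggregator
    consumer.join()
    return storage
```

Dropping the broker and calling the storage step directly from the collector gives the first, simpler topology; the trade-off is that a slow or unavailable storage layer then stalls the collectors directly.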

7. Extensions

APM
SkyWalking architecture
