Telegraf & Kapacitor: The InfluxData Playbook

Date: 2019-12-09

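InfluxDB has released its official 1.0 version (the latest is 1.1), accompanied by several other products including Telegraf, Chronograf, and Kapacitor. InfluxData has also launched an enterprise edition of InfluxDB and the InfluxCloud cloud service. The lineup looks like a bid to own the whole business around time-series databases, from metric collection to analysis and charting, somewhat in the mold of the ELK stack. Today let's look at the playbook behind it.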

Telegraf

Telegraf is a data collection agent, used much like Collectd, StatsD, or Logstash. Data input and output are implemented through plugins.

Look at the long list of plugins below; impressive, isn't it?

Input Plugins

aws cloudwatch
aerospike
apache
bcache
cassandra
ceph
chrony
consul
conntrack
couchbase
couchdb
disque
dns query time
docker
dovecot
elasticsearch
exec (generic executable plugin, support JSON, influx, graphite and nagios)
filestat
haproxy
hddtemp
http_response
httpjson (generic JSON-emitting http service plugin)
influxdb
ipmi_sensor
iptables
jolokia
leofs
lustre2
mailchimp
memcached
mesos
mongodb
mysql
net_response
nginx
nsq
nstat
ntpq
phpfpm
phusion passenger
ping
postgresql
postgresql_extensible
powerdns
procstat
prometheus
puppetagent
rabbitmq
raindrops
redis
rethinkdb
riak
sensors
snmp
snmp_legacy
sql server (microsoft)
twemproxy
varnish
zfs
zookeeper
win_perf_counters (windows performance counters)
sysstat
system
    cpu
    mem
    net
    netstat
    disk
    diskio
    swap
    processes
    kernel (/proc/stat)
    kernel (/proc/vmstat)

Service plugins:

http_listener
kafka_consumer
mqtt_consumer
nats_consumer
nsq_consumer
logparser
statsd
tail
tcp_listener
udp_listener
webhooks
    filestack
    github
    mandrill
    rollbar

Processor Plugins

printer

Aggregator Plugins

minmax

Output Plugins

influxdb
amon
amqp
aws kinesis
aws cloudwatch
datadog
file
graphite
graylog
instrumental
kafka
librato
mqtt
nats
nsq
opentsdb
prometheus
riemann

Installation and Usage

Installation

The author used the simplest method, installing from the RPM package:

curl -LO https://dl.influxdata.com/telegraf/releases/telegraf-1.1.1.x86_64.rpm
rpm -ivh telegraf-1.1.1.x86_64.rpm

Starting

Start it in standalone mode, where you can specify which plugins to use. A bit like Logstash, isn't it?

telegraf --config telegraf.conf -input-filter cpu:mem -output-filter influxdb
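
The colon-separated filter values in the command above (cpu:mem) select a subset of the configured plugins. As a rough illustration of that idea only, here is a minimal Python sketch; the function name select_plugins and the plugin lists are hypothetical, and this is not how Telegraf itself is implemented.

```python
# Hypothetical illustration of a colon-separated plugin filter.
# This is NOT Telegraf's implementation, only a sketch of the concept.

def select_plugins(configured, filter_spec):
    """Return the configured plugin names allowed by a filter like 'cpu:mem'."""
    if not filter_spec:
        return list(configured)            # no filter: keep every plugin
    wanted = set(filter_spec.split(":"))   # 'cpu:mem' -> {'cpu', 'mem'}
    return [name for name in configured if name in wanted]


# Assumed example configuration (made up for this sketch).
inputs = ["cpu", "mem", "disk", "net"]
outputs = ["influxdb", "file"]

print(select_plugins(inputs, "cpu:mem"))     # ['cpu', 'mem']
print(select_plugins(outputs, "influxdb"))   # ['influxdb']
```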

Alternatively, start it as a background daemon.
First edit the default configuration file:

vim /etc/telegraf/telegraf.conf

Once the plugins are configured, start the service:

service telegraf start

Chronograf

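Chronograf is a charting tool whose overall look and feel closely resembles Grafana.

Of course, compared with the already fairly mature Grafana, it still falls somewhat short.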

Kapacitor

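Kapacitor is a tool for analyzing and processing time-series data. It can periodically aggregate and process data from InfluxDB and write the results back to InfluxDB, or trigger alerts (via Email, HTTP, TCP, HipChat, OpsGenie, Alerta, Sensu, PagerDuty, Slack, and other channels).

Kapacitor tasks are written in a DSL called TICKscript. Here is an example: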

stream
    |from()
        .measurement('cpu_usage_idle')
        .groupBy('host')
    |window()
        .period(1m)
        .every(1m)
    |mean('value')
    |eval(lambda: 100.0 - "mean")
        .as('used')
    |alert()
        .message('{{ .Level}}: {{ .Name }}/{{ index .Tags "host" }} has high cpu usage: {{ index .Fields "used" }}')
        .warn(lambda: "used" > 70.0)
        .crit(lambda: "used" > 85.0)

        // Send alert to handler of choice.

        // Slack
        .slack()
        .channel('#alerts')

        // VictorOps
        .victorOps()
        .routingKey('team_rocket')

        // PagerDuty
        .pagerDuty()

Let's walk through the configuration using the example above.

stream

stream marks the start of a pipeline definition, roughly like a function; it can also be assigned to a variable:

var errors = stream
    |from()
        .measurement('errors')

from

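from defines the data source. In a batch task, query can be used instead to write the InfluxQL statement directly:

batch
    |query('''
        SELECT mean("value")
        FROM "telegraf"."default".cpu_usage_idle
        WHERE "host" = 'serverA'
    ''')
        // period and every are illustrative values
        .period(1m)
        .every(1m)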

window

window defines the time range of the time-series data to process:

|window()
        .period(10m)
        .every(1m)

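This runs once per minute over the metrics from the preceding 10 minutes. For batch queries, a cron expression can be used instead to schedule execution.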
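
To make the interplay of period and every concrete, here is a minimal Python sketch of a sliding window. It is a simplification under stated assumptions (integer minute timestamps, one made-up sample per minute, partial windows emitted before a full period has elapsed) and is not Kapacitor's implementation; the name sliding_windows is hypothetical.

```python
# Simplified sketch of period/every semantics; not Kapacitor's actual code.

def sliding_windows(points, period, every):
    """Yield (emit_time, values): every `every` minutes, emit the values
    whose timestamps fall within the preceding `period` minutes."""
    if not points:
        return
    first, last = points[0][0], points[-1][0]
    emit = first + every
    while emit <= last + every:
        values = [v for t, v in points if emit - period <= t < emit]
        yield emit, values
        emit += every


# Hypothetical data: one sample per minute for 15 minutes, as (minute, value).
points = [(m, float(m)) for m in range(15)]

windows = list(sliding_windows(points, period=10, every=1))
emit, values = windows[-1]
print(emit, len(values), values[0], values[-1])   # 15 10 5.0 14.0
```

Consecutive windows overlap by nine minutes here, which is why a single spike influences ten successive results when period is 10m and every is 1m.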

mean

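mean computes the average (arithmetic mean) of the values. Many other functions are available as well, such as derivative (rate of change) and difference (difference between consecutive values); readers who have used Grafana should find them familiar.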
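
The following Python sketch shows what these aggregations compute on a tiny, made-up series, and how the mean differs from the median. The helper names difference and derivative are defined here for illustration and only approximate the behavior of the Kapacitor nodes of the same name.

```python
from statistics import mean, median


def difference(values):
    """Difference between each value and the one before it."""
    return [b - a for a, b in zip(values, values[1:])]


def derivative(points):
    """Rate of change per second between consecutive (seconds, value) points.
    Assumes strictly increasing timestamps."""
    return [(v2 - v1) / (t2 - t1)
            for (t1, v1), (t2, v2) in zip(points, points[1:])]


# Hypothetical samples taken 10 seconds apart.
points = [(0, 10.0), (10, 20.0), (20, 90.0)]
values = [v for _, v in points]

print(mean(values))        # 40.0 (average)
print(median(values))      # 20.0 (middle value, not the same as the mean)
print(difference(values))  # [10.0, 70.0]
print(derivative(points))  # [1.0, 7.0]
```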

eval

eval transforms the data with a custom lambda expression; here the result of 100 - "mean" is named used:

|eval(lambda: 100.0 - "mean")
        .as('used')
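
As a rough Python equivalent of the groupBy, mean, and eval steps in the example, the sketch below groups idle-CPU samples by host, averages them, and derives used as 100 minus the mean. The sample data and the function name used_by_host are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def used_by_host(samples):
    """samples: iterable of (host, cpu_usage_idle) pairs.
    Returns {host: 100.0 - mean(idle)} for one window of data."""
    grouped = defaultdict(list)
    for host, idle in samples:          # groupBy('host')
        grouped[host].append(idle)
    # mean('value') followed by eval(lambda: 100.0 - "mean").as('used')
    return {host: 100.0 - mean(vals) for host, vals in grouped.items()}


# Hypothetical idle percentages within one window.
samples = [("serverA", 20.0), ("serverA", 30.0),
           ("serverB", 90.0), ("serverB", 80.0)]

print(used_by_host(samples))   # {'serverA': 75.0, 'serverB': 15.0}
```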

alert

alert is the alerting method.

id and message are the alert's title and body, respectively.

Thresholds can be set for several alert levels (OK, INFO, WARNING, CRITICAL):

stream
    |from()
        .groupBy('service')
    |alert()
        .id('kapacitor/{{ index .Tags "service" }}')
        .message('{{ .ID }} is {{ .Level }} value:{{ index .Fields "value" }}')
        .info(lambda: "value" > 10)
        .warn(lambda: "value" > 20)
        .crit(lambda: "value" > 30)
        .post('http://example.com/api/alert')
        .post('http://another.example.com/api/alert')
        .tcp('exampleendpoint.com:5678')
        .email('oncall@example.com')
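
The way the info, warn, and crit thresholds map a value to a level can be sketched in Python as follows. This is a minimal illustration that assumes the most severe matching level wins; the function names and the service name are hypothetical, and real Kapacitor alerts also track state changes, which this sketch ignores.

```python
# Sketch of threshold-to-level mapping; assumes the most severe match wins.

def alert_level(value, info=10, warn=20, crit=30):
    """Return the alert level name for a value, using the example thresholds."""
    if value > crit:
        return "CRITICAL"
    if value > warn:
        return "WARNING"
    if value > info:
        return "INFO"
    return "OK"


def render_message(service, value):
    """Mimic the example's id and message templates with plain formatting."""
    alert_id = f"kapacitor/{service}"
    return f"{alert_id} is {alert_level(value)} value:{value}"


print(alert_level(5))                 # OK
print(alert_level(25))                # WARNING
print(render_message("web", 31))      # kapacitor/web is CRITICAL value:31
```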

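The last step is choosing how alerts are delivered; some handler parameters can be set in Kapacitor's startup configuration file.

Starting

After writing the TICKscript, save it to a file, for example cpu_alert.tick,
then define and enable the task with the following commands: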

# Define the task (assumes cpu data is in db 'telegraf')
kapacitor define \
    cpu_alert \
    -type stream \
    -dbrp telegraf.default \
    -tick ./cpu_alert.tick
# Start the task
kapacitor enable cpu_alert
