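Metrics: Measuring Service Metrics

Introduction

As a library for measuring monitoring metrics, Metrics provides a number of modules that supply supporting statistics for third-party libraries and applications. It offers the metric types Gauge, Counter, Meter, Histogram, and Timer, as well as a Health Check facility.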

Maven Configuration

metrics-core is the core Metrics library. It defines the metric types and must be declared as a dependency in pom.xml.

<dependencies>
    <dependency>
        <groupId>io.dropwizard.metrics</groupId>
        <artifactId>metrics-core</artifactId>
        <version>3.1.0</version>
    </dependency>
</dependencies>

MetricRegistry

The MetricRegistry class is the core container; internally it uses a ConcurrentHashMap to hold all registered metrics. The core registration code is:

public <T extends Metric> T register(String name, T metric) throws IllegalArgumentException {
        if (metric instanceof MetricSet) {
            registerAll(name, (MetricSet) metric);
        } else {
            final Metric existing = metrics.putIfAbsent(name, metric);
            if (existing == null) {
                onMetricAdded(name, metric);
            } else {
                throw new IllegalArgumentException("A metric named " + name + " already exists");
            }
        }
        return metric;
    }
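
The registration logic above relies on putIfAbsent to make "register once" atomic. The following is a minimal standard-library sketch of that idea only; the class RegistrySketch is hypothetical and stores plain objects rather than real Metric instances.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for a registry; it is not part of the Metrics library.
class RegistrySketch {

    private final ConcurrentMap<String, Object> metrics = new ConcurrentHashMap<>();

    // Stores the metric under the given name, or throws if the name is already taken.
    <T> T register(String name, T metric) {
        // putIfAbsent returns the previous value, or null if the name was free.
        Object existing = metrics.putIfAbsent(name, metric);
        if (existing != null) {
            throw new IllegalArgumentException("A metric named " + name + " already exists");
        }
        return metric;
    }

    public static void main(String[] args) {
        RegistrySketch registry = new RegistrySketch();
        registry.register("requests", new Object());
        try {
            // The second registration under the same name is rejected.
            registry.register("requests", new Object());
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Because the check and the insert happen in a single putIfAbsent call, two threads racing to register the same name cannot both succeed.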

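Every metric needs a unique name, and MetricRegistry provides helpers for generating one. A name can be derived from a class name or built from custom parts; in essence it is plain string concatenation, with the non-empty parts joined by dots.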

public static String name(String name, String... names) {
        final StringBuilder builder = new StringBuilder();
        append(builder, name);
        if (names != null) {
            for (String s : names) {
                append(builder, s);
            }
        }
        return builder.toString();
    }

    public static String name(Class<?> klass, String... names) {
        return name(klass.getName(), names);
    }

    private static void append(StringBuilder builder, String part) {
        if (part != null && !part.isEmpty()) {
            if (builder.length() > 0) {
                builder.append('.');
            }
            builder.append(part);
        }
    }
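
To make the naming rules concrete, here is a small usage sketch. The class MetricNameSketch is hypothetical: it copies the name-joining logic quoted above so that it runs without the library, and the metric names are invented for illustration.

```java
// Hypothetical class; the three methods mirror the name-joining code quoted above.
class MetricNameSketch {

    static String name(String name, String... names) {
        StringBuilder builder = new StringBuilder();
        append(builder, name);
        if (names != null) {
            for (String s : names) {
                append(builder, s);
            }
        }
        return builder.toString();
    }

    static String name(Class<?> klass, String... names) {
        return name(klass.getName(), names);
    }

    // Skips null and empty parts; separates the remaining parts with a dot.
    private static void append(StringBuilder builder, String part) {
        if (part != null && !part.isEmpty()) {
            if (builder.length() > 0) {
                builder.append('.');
            }
            builder.append(part);
        }
    }

    public static void main(String[] args) {
        System.out.println(name("orders", "create", "latency")); // orders.create.latency
        System.out.println(name("orders", null, "", "errors"));  // orders.errors
        System.out.println(name(String.class, "length"));        // java.lang.String.length
    }
}
```

Null and empty parts are dropped silently, so callers can pass optional name segments without producing double dots.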

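Reporting Metrics Data

Metrics provides the Reporter interface for publishing the metric data it holds. metrics-core ships with ConsoleReporter, CsvReporter, Slf4jReporter, and JmxReporter. The examples in this article use ConsoleReporter.

Companies that use the Falcon monitoring system can implement a custom Reporter modeled on ConsoleReporter, which lets Metrics integrate seamlessly with the in-house monitoring system.

Metric Types

Gauge

A Gauge records the instantaneous value of a metric, such as the service's current JVM memory usage.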

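import com.codahale.metrics.ConsoleReporter;
import com.codahale.metrics.Gauge;
import com.codahale.metrics.MetricRegistry;

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.util.concurrent.TimeUnit;

public class JvmGaugeTest {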

    public static void main(String[] args) throws Exception {
        MetricRegistry registry = new MetricRegistry();

        ConsoleReporter reporter = ConsoleReporter.forRegistry(registry).build();
        reporter.start(1, TimeUnit.SECONDS);

        MemoryMXBean memoryMXBean = ManagementFactory.getMemoryMXBean();
        registry.register("jvm.total.used",
                new Gauge<Long>() {

                    @Override
                    public Long getValue() {
                        return memoryMXBean.getHeapMemoryUsage().getUsed()
                                + memoryMXBean.getNonHeapMemoryUsage().getUsed();
                    }
                });

        while (true) {
            Thread.sleep(1000);
        }
    }
}

Running the code produces the following output:

-- Gauges ----------------------------------------------------------------------
jvm.total.used
             value = 16314496

Counter

A Counter is a counter that can be incremented and decremented to maintain a running total.

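import com.codahale.metrics.ConsoleReporter;
import com.codahale.metrics.Counter;
import com.codahale.metrics.MetricRegistry;

import java.util.Queue;
import java.util.Random;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class CounterTest {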

    private static Queue<String> queue = new LinkedBlockingQueue<String>();
    private static Counter pendingJobs;
    private static Random random = new Random();

    public static void addJob(String job) {
        pendingJobs.inc();
        queue.offer(job);
    }

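    public static String takeJob() {
        String job = queue.poll();
        // Decrement only when a job was actually removed, so the counter cannot go negative.
        if (job != null) {
            pendingJobs.dec();
        }
        return job;
    }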

    public static void main(String[] args) throws InterruptedException {
        MetricRegistry registry = new MetricRegistry();

        ConsoleReporter reporter = ConsoleReporter.forRegistry(registry).build();
        reporter.start(1, TimeUnit.SECONDS);

        pendingJobs = registry.counter("pending.jobs.size");
        for (int num = 1; ; num++) {
            Thread.sleep(100);
            if (random.nextDouble() > 0.8) {
                takeJob();
            } else {
                addJob("Job-" + num);
            }
        }
    }
}

Running the code produces the following output:

-- Counters --------------------------------------------------------------------
pending.jobs.size
             count = 19

Meter

A Meter measures how often an event occurs, reporting the mean rate along with the rates over the last 1, 5, and 15 minutes.

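import com.codahale.metrics.ConsoleReporter;
import com.codahale.metrics.Meter;
import com.codahale.metrics.MetricRegistry;

import java.util.Random;
import java.util.concurrent.TimeUnit;

public class MeterTest {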

    private static Random random = new Random();

    public static void request(Meter meter, int times) {
        for (int i = 0; i < times; i++) {
            meter.mark();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        MetricRegistry registry = new MetricRegistry();

        ConsoleReporter reporter = ConsoleReporter.forRegistry(registry).build();
        reporter.start(1, TimeUnit.SECONDS);

        Meter meterTps = registry.meter("request.tps");
        while (true) {
            request(meterTps, random.nextInt(10));
            Thread.sleep(1000);
        }
    }

}

Running the code produces the following output:

-- Meters ----------------------------------------------------------------------
request.tps
             count = 115
         mean rate = 5.00 events/second
     1-minute rate = 7.04 events/second
     5-minute rate = 7.63 events/second
    15-minute rate = 7.74 events/second

Meter is modeled on the UNIX load average and uses an exponentially weighted moving average (EWMA), in which more recent data carries more weight.
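
The following is a minimal sketch of how such a moving average can be computed. The class EwmaSketch, the 5-second tick interval, and the smoothing-factor formula are assumptions modeled on the load-average calculation, not code taken from Metrics, and the sketch is not thread-safe.

```java
// Hypothetical single-threaded EWMA; names and constants are illustrative assumptions.
class EwmaSketch {

    private static final double TICK_SECONDS = 5.0; // assumed interval between ticks

    private final double alpha;   // smoothing factor derived from the window length
    private double rate;          // current average, in events per second
    private boolean initialized;
    private long uncounted;       // events seen since the last tick

    EwmaSketch(double windowMinutes) {
        // A longer window gives a smaller alpha, so old data fades more slowly.
        this.alpha = 1 - Math.exp(-TICK_SECONDS / 60.0 / windowMinutes);
    }

    void mark(long n) {
        uncounted += n;
    }

    // Must be called once per TICK_SECONDS; folds the latest interval into the average.
    void tick() {
        double instantRate = uncounted / TICK_SECONDS;
        uncounted = 0;
        if (initialized) {
            rate += alpha * (instantRate - rate);
        } else {
            rate = instantRate;
            initialized = true;
        }
    }

    double ratePerSecond() {
        return rate;
    }

    public static void main(String[] args) {
        EwmaSketch oneMinute = new EwmaSketch(1);
        EwmaSketch fifteenMinute = new EwmaSketch(15);
        // One busy interval at 10 events/second, followed by a minute of silence.
        oneMinute.mark(50);
        fifteenMinute.mark(50);
        oneMinute.tick();
        fifteenMinute.tick();
        for (int i = 0; i < 12; i++) {
            oneMinute.tick();
            fifteenMinute.tick();
        }
        System.out.printf("1-minute rate:  %.2f events/second%n", oneMinute.ratePerSecond());
        System.out.printf("15-minute rate: %.2f events/second%n", fifteenMinute.ratePerSecond());
    }
}
```

After the idle minute the 1-minute average has decayed much further than the 15-minute one, which is why a short burst shows up strongly in the short window and only faintly in the long one.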

Histogram

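A Histogram captures the distribution of a stream of values, reporting the minimum, maximum, mean, standard deviation, median, and the 75th, 95th, 98th, 99th, and 99.9th percentiles.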

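import com.codahale.metrics.ConsoleReporter;
import com.codahale.metrics.Histogram;
import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.UniformReservoir;

import java.util.Random;
import java.util.concurrent.TimeUnit;

public class HistogramTest {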

    private static Random random = new Random();

    public static void main(String[] args) throws Exception {
        MetricRegistry registry = new MetricRegistry();

        ConsoleReporter reporter = ConsoleReporter.forRegistry(registry).build();
        reporter.start(1, TimeUnit.SECONDS);

        Histogram histogram = new Histogram(new UniformReservoir());
        registry.register("request.histogram", histogram);

        while (true) {
            Thread.sleep(1000);
            histogram.update(random.nextInt(100));
        }
    }

}

Running the code produces the following output:

-- Histograms ------------------------------------------------------------------
request.histogram
             count = 18
               min = 10
               max = 98
              mean = 53.28
            stddev = 29.48
            median = 44.50
              75% <= 83.50
              95% <= 98.00
              98% <= 98.00
              99% <= 98.00
            99.9% <= 98.00
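
The percentile lines in this output are derived from the sampled values. Below is a minimal sketch of one common way to compute them, linear interpolation between neighboring sorted samples. The class QuantileSketch and the exact interpolation rule are assumptions for illustration, not code taken from Metrics.

```java
import java.util.Arrays;

// Hypothetical helper; the interpolation rule is one common convention among several.
class QuantileSketch {

    // Returns the value at quantile q (0.0 to 1.0) of the given samples.
    static double quantile(long[] samples, double q) {
        if (q < 0.0 || q > 1.0) {
            throw new IllegalArgumentException(q + " is not in [0..1]");
        }
        if (samples.length == 0) {
            return 0.0;
        }
        long[] sorted = samples.clone();
        Arrays.sort(sorted);

        // 1-based position of the quantile within the sorted samples.
        double pos = q * (sorted.length + 1);
        if (pos < 1) {
            return sorted[0];
        }
        if (pos >= sorted.length) {
            return sorted[sorted.length - 1];
        }
        // Interpolate between the two samples on either side of the position.
        double lower = sorted[(int) pos - 1];
        double upper = sorted[(int) pos];
        return lower + (pos - Math.floor(pos)) * (upper - lower);
    }

    public static void main(String[] args) {
        long[] samples = {50, 10, 40, 20, 30};
        System.out.println("median = " + quantile(samples, 0.50)); // 30.0
        System.out.println("75%   <= " + quantile(samples, 0.75)); // 45.0
        System.out.println("99%   <= " + quantile(samples, 0.99)); // 50.0
    }
}
```

With only a handful of samples the high percentiles collapse onto the maximum, which matches the run above, where the 95th through 99.9th percentiles all equal max.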
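
Data Sampling

To report a distribution, a Histogram has to keep a sample of the values it has seen. The built-in sampling implementations are:

  • ExponentiallyDecayingReservoir: exponentially decaying sampling. The difference between a value's update time and the start time is converted into a weight, and the larger the weight, the more likely the value is to be retained.
  • UniformReservoir: uniform random sampling. As the number of updates grows, the probability that a new value is sampled decreases.
  • SlidingWindowReservoir: sliding-window sampling that always keeps the most recent values.
  • SlidingTimeWindowReservoir: sliding-time-window sampling that always keeps the values from the most recent time period.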
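
As an illustration of the uniform case, the sketch below implements classic reservoir sampling (Vitter's Algorithm R) with the standard library only. The class UniformReservoirSketch, its constructor parameters, and the capacity of 100 are assumptions for this example, and the class is not thread-safe.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical fixed-size uniform sample; not the library's own implementation.
class UniformReservoirSketch {

    private final long[] values;
    private final Random random;
    private long count; // total number of updates seen so far

    UniformReservoirSketch(int capacity, long seed) {
        this.values = new long[capacity];
        this.random = new Random(seed);
    }

    void update(long value) {
        count++;
        if (count <= values.length) {
            // Still filling the reservoir: keep every value.
            values[(int) count - 1] = value;
        } else {
            // Keep the new value with probability capacity / count,
            // overwriting a uniformly chosen slot.
            long slot = (long) (random.nextDouble() * count);
            if (slot < values.length) {
                values[(int) slot] = value;
            }
        }
    }

    int size() {
        return (int) Math.min(count, values.length);
    }

    long[] snapshot() {
        return Arrays.copyOf(values, size());
    }

    public static void main(String[] args) {
        UniformReservoirSketch reservoir = new UniformReservoirSketch(100, 42L);
        for (long v = 0; v < 10_000; v++) {
            reservoir.update(v);
        }
        // Memory stays bounded no matter how many values are recorded.
        System.out.println("retained samples: " + reservoir.size());
    }
}
```

The acceptance probability shrinks as the update count grows, which is the behavior the UniformReservoir entry describes, and it means every value seen so far has the same chance of being in the sample.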

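Caveats

When using SlidingTimeWindowReservoir, watch its memory footprint: it keeps every value recorded within the time window and places no cap on how many that is, so a high-traffic service can consume a lot of memory. ExponentiallyDecayingReservoir, by contrast, holds at most a fixed number of samples (1028 by default).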

Timer

A Timer combines a Histogram and a Meter: the Histogram tracks the distribution of durations, and the Meter tracks QPS.

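import com.codahale.metrics.ConsoleReporter;
import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.Timer;

import java.util.Random;
import java.util.concurrent.TimeUnit;

public class TimerTest {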

    public static Random random = new Random();

    private static void request() throws InterruptedException {
        Thread.sleep(random.nextInt(1000));
    }

    public static void main(String[] args) throws Exception {
        MetricRegistry registry = new MetricRegistry();

        ConsoleReporter reporter = ConsoleReporter.forRegistry(registry).build();
        reporter.start(1, TimeUnit.SECONDS);

        Timer timer = registry.timer("request.latency");
        Timer.Context ctx;
        while (true) {
            ctx = timer.time();
            request();
            ctx.stop();
        }
    }
}

Running the code produces the following output:

-- Timers ----------------------------------------------------------------------
request.latency
             count = 22
         mean rate = 2.00 calls/second
     1-minute rate = 1.98 calls/second
     5-minute rate = 2.00 calls/second
    15-minute rate = 2.00 calls/second
               min = 148.93 milliseconds
               max = 865.65 milliseconds
              mean = 491.89 milliseconds
            stddev = 219.59 milliseconds
            median = 465.60 milliseconds
              75% <= 671.65 milliseconds
              95% <= 850.28 milliseconds
              98% <= 865.65 milliseconds
              99% <= 865.65 milliseconds
            99.9% <= 865.65 milliseconds
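
In the TimerTest loop, a measurement is lost if request() throws before ctx.stop() runs. The sketch below shows one way to guard against that with try-with-resources. TimerSketch and its Context class are hypothetical standard-library stand-ins that only record durations and a count; they are not the Metrics Timer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal timer; it keeps raw durations instead of a sampled histogram.
class TimerSketch {

    private final List<Long> durationsNanos = new ArrayList<>();

    // Started on creation; records the elapsed time when closed.
    final class Context implements AutoCloseable {
        private final long start = System.nanoTime();

        @Override
        public void close() {
            record(System.nanoTime() - start);
        }
    }

    Context time() {
        return new Context();
    }

    synchronized void record(long nanos) {
        durationsNanos.add(nanos);
    }

    synchronized long count() {
        return durationsNanos.size();
    }

    synchronized long maxNanos() {
        long max = 0;
        for (long d : durationsNanos) {
            max = Math.max(max, d);
        }
        return max;
    }

    public static void main(String[] args) throws InterruptedException {
        TimerSketch timer = new TimerSketch();
        for (int i = 0; i < 3; i++) {
            // close() runs even if the body throws, so every call is measured.
            try (TimerSketch.Context ignored = timer.time()) {
                Thread.sleep(10);
            }
        }
        System.out.println("count = " + timer.count());
        System.out.println("max   = " + timer.maxNanos() / 1_000_000.0 + " milliseconds");
    }
}
```

The same effect can be had with an explicit try/finally around the timed call; the point is only that stopping the timer should not depend on the call succeeding.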

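Summary

Use a Gauge to report an instantaneous value of a service, such as JVM memory usage. Use a Histogram to capture a distribution, such as the response-time distribution of an endpoint. Use a Meter to measure a rate, such as the request rate of an endpoint. Use a Timer when both the rate and the distribution are needed, such as the request rate and latency of an endpoint.

References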

Metrics Core
