A look at Flink's AscendingTimestampExtractor

This article takes a look at Flink's AscendingTimestampExtractor.

AscendingTimestampExtractor

flink-streaming-java_2.11-1.7.0-sources.jar!/org/apache/flink/streaming/api/functions/timestamps/AscendingTimestampExtractor.java

/**
 * A timestamp assigner and watermark generator for streams where timestamps are monotonously
 * ascending. In this case, the local watermarks for the streams are easy to generate, because
 * they strictly follow the timestamps.
 *
 * @param <T> The type of the elements that this function can extract timestamps from
 */
@PublicEvolving
public abstract class AscendingTimestampExtractor<T> implements AssignerWithPeriodicWatermarks<T> {

    private static final long serialVersionUID = 1L;

    /** The current timestamp. */
    private long currentTimestamp = Long.MIN_VALUE;

    /** Handler that is called when timestamp monotony is violated. */
    private MonotonyViolationHandler violationHandler = new LoggingHandler();


    /**
     * Extracts the timestamp from the given element. The timestamp must be monotonically increasing.
     *
     * @param element The element that the timestamp is extracted from.
     * @return The new timestamp.
     */
    public abstract long extractAscendingTimestamp(T element);

    /**
     * Sets the handler for violations to the ascending timestamp order.
     *
     * @param handler The violation handler to use.
     * @return This extractor.
     */
    public AscendingTimestampExtractor<T> withViolationHandler(MonotonyViolationHandler handler) {
        this.violationHandler = requireNonNull(handler);
        return this;
    }

    // ------------------------------------------------------------------------

    @Override
    public final long extractTimestamp(T element, long elementPrevTimestamp) {
        final long newTimestamp = extractAscendingTimestamp(element);
        if (newTimestamp >= this.currentTimestamp) {
            this.currentTimestamp = newTimestamp;
            return newTimestamp;
        } else {
            violationHandler.handleViolation(newTimestamp, this.currentTimestamp);
            return newTimestamp;
        }
    }

    @Override
    public final Watermark getCurrentWatermark() {
        return new Watermark(currentTimestamp == Long.MIN_VALUE ? Long.MIN_VALUE : currentTimestamp - 1);
    }

    //......
}
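  • The abstract class AscendingTimestampExtractor implements the extractTimestamp and getCurrentWatermark methods of the AssignerWithPeriodicWatermarks interface, and declares the abstract method extractAscendingTimestamp for subclasses to implement.
  • AscendingTimestampExtractor is intended for streams whose element timestamps are monotonically ascending (timestamp monotony) within each parallel task. extractTimestamp first calls the subclass's extractAscendingTimestamp to get newTimestamp from the element. If newTimestamp >= currentTimestamp, it updates currentTimestamp and returns newTimestamp; otherwise monotony has been violated, so it hands both values to the MonotonyViolationHandler and still returns newTimestamp, leaving currentTimestamp unchanged.
  • getCurrentWatermark returns Watermark(currentTimestamp - 1) when currentTimestamp is not Long.MIN_VALUE, and Watermark(Long.MIN_VALUE) otherwise.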
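
To make the bookkeeping in the bullets above concrete, here is a minimal stand-alone sketch in plain Java with no Flink dependency. The class name AscendingWatermarkSketch and its violations counter are made up for illustration; only the ">=" comparison and the "currentTimestamp - 1" rule are taken from the source listing.

```java
/** Stand-alone sketch (hypothetical class, not part of Flink) of the extractor's bookkeeping. */
final class AscendingWatermarkSketch {

    // Highest timestamp seen so far; Long.MIN_VALUE means "no element seen yet".
    private long currentTimestamp = Long.MIN_VALUE;

    // Illustrative stand-in for the violation handler: just counts violations.
    private int violations = 0;

    /** Mirrors extractTimestamp: the element's own timestamp is always returned. */
    long extractTimestamp(long newTimestamp) {
        if (newTimestamp >= currentTimestamp) {
            currentTimestamp = newTimestamp; // in order: advance the high-water mark
        } else {
            violations++;                    // out of order: currentTimestamp stays put
        }
        return newTimestamp;
    }

    /** Mirrors getCurrentWatermark: one below the highest timestamp seen. */
    long currentWatermark() {
        return currentTimestamp == Long.MIN_VALUE ? Long.MIN_VALUE : currentTimestamp - 1;
    }

    int violations() {
        return violations;
    }

    public static void main(String[] args) {
        AscendingWatermarkSketch s = new AscendingWatermarkSketch();
        for (long ts : new long[] {13L, 13L, 14L, 20L}) {
            s.extractTimestamp(ts);
        }
        System.out.println(s.currentWatermark());    // 19
        System.out.println(s.extractTimestamp(15L)); // 15: the late element keeps its timestamp
        System.out.println(s.currentWatermark());    // 19: the watermark does not move backwards
    }
}
```

The subtraction of 1 follows from the ">=" comparison: an element whose timestamp equals currentTimestamp still counts as in order, so the watermark has to stay one below the highest timestamp seen.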

MonotonyViolationHandler

flink-streaming-java_2.11-1.7.0-sources.jar!/org/apache/flink/streaming/api/functions/timestamps/AscendingTimestampExtractor.java

    /**
     * Interface for handlers that handle violations of the monotonous ascending timestamps
     * property.
     */
    public interface MonotonyViolationHandler extends java.io.Serializable {

        /**
         * Called when the property of monotonously ascending timestamps is violated, i.e.,
         * when {@code elementTimestamp < lastTimestamp}.
         *
         * @param elementTimestamp The timestamp of the current element.
         * @param lastTimestamp The last timestamp.
         */
        void handleViolation(long elementTimestamp, long lastTimestamp);
    }

    /**
     * Handler that does nothing when timestamp monotony is violated.
     */
    public static final class IgnoringHandler implements MonotonyViolationHandler {
        private static final long serialVersionUID = 1L;

        @Override
        public void handleViolation(long elementTimestamp, long lastTimestamp) {}
    }

    /**
     * Handler that fails the program when timestamp monotony is violated.
     */
    public static final class FailingHandler implements MonotonyViolationHandler {
        private static final long serialVersionUID = 1L;

        @Override
        public void handleViolation(long elementTimestamp, long lastTimestamp) {
            throw new RuntimeException("Ascending timestamps condition violated. Element timestamp "
                    + elementTimestamp + " is smaller than last timestamp " + lastTimestamp);
        }
    }

    /**
     * Handler that only logs violations of timestamp monotony, on WARN log level.
     */
    public static final class LoggingHandler implements MonotonyViolationHandler {
        private static final long serialVersionUID = 1L;

        private static final Logger LOG = LoggerFactory.getLogger(AscendingTimestampExtractor.class);

        @Override
        public void handleViolation(long elementTimestamp, long lastTimestamp) {
            LOG.warn("Timestamp monotony violated: {} < {}", elementTimestamp, lastTimestamp);
        }
    }
  • MonotonyViolationHandler extends Serializable and defines the handleViolation method. The interface has three built-in implementations: IgnoringHandler, FailingHandler and LoggingHandler.
  • IgnoringHandler's handleViolation does nothing; FailingHandler's handleViolation throws a RuntimeException; LoggingHandler's handleViolation logs a message at WARN level.
  • AscendingTimestampExtractor uses LoggingHandler by default; a different handler can be set through the withViolationHandler method.
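
Since the handler is a single-method interface, a strategy other than the three built-in ones is easy to write. The sketch below is plain Java with no Flink dependency: ViolationHandlerSketch, CountingHandlerSketch and ViolationHandlerDemo are hypothetical stand-ins named for this illustration, and only the handleViolation(elementTimestamp, lastTimestamp) shape is taken from the source listing.

```java
/** Demo driver (hypothetical): feeds timestamps through the same ">= last" check. */
final class ViolationHandlerDemo {

    static void feed(long[] timestamps, ViolationHandlerSketch handler) {
        long last = Long.MIN_VALUE;
        for (long ts : timestamps) {
            if (ts >= last) {
                last = ts;                         // in order
            } else {
                handler.handleViolation(ts, last); // out of order: delegate to the strategy
            }
        }
    }

    public static void main(String[] args) {
        CountingHandlerSketch counting = new CountingHandlerSketch();
        feed(new long[] {1000L, 1000L, 999L}, counting);
        System.out.println(counting.count()); // 1

        // A failing strategy written as a lambda.
        try {
            feed(new long[] {1000L, 999L}, (ts, last) -> {
                throw new RuntimeException(ts + " < " + last);
            });
        } catch (RuntimeException e) {
            System.out.println(e.getMessage()); // 999 < 1000
        }
    }
}

/** Stand-in for MonotonyViolationHandler; NOT the Flink interface itself. */
interface ViolationHandlerSketch extends java.io.Serializable {
    void handleViolation(long elementTimestamp, long lastTimestamp);
}

/** Custom strategy: count violations instead of ignoring, logging or failing. */
final class CountingHandlerSketch implements ViolationHandlerSketch {
    private static final long serialVersionUID = 1L;

    private long count = 0;

    @Override
    public void handleViolation(long elementTimestamp, long lastTimestamp) {
        count++;
    }

    long count() {
        return count;
    }
}
```

With the real Flink classes, a handler of this shape would be passed to withViolationHandler. Because the handler is serialized along with the extractor, a counter like this one would presumably be local to each parallel task rather than shared across the job.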

Example

@Test
    public void testWithFailingHandler() {
        AscendingTimestampExtractor<Long> extractor = (new AscendingTimestampExtractorTest.LongExtractor()).withViolationHandler(new FailingHandler());
        this.runValidTests(extractor);

        try {
            this.runInvalidTest(extractor);
            Assert.fail("should fail with an exception");
        } catch (Exception expected) {
            // expected: FailingHandler throws a RuntimeException on the out-of-order timestamp
        }
    }

    private void runValidTests(AscendingTimestampExtractor<Long> extractor) {
        Assert.assertEquals(13L, extractor.extractTimestamp(13L, -1L));
        Assert.assertEquals(13L, extractor.extractTimestamp(13L, 0L));
        Assert.assertEquals(14L, extractor.extractTimestamp(14L, 0L));
        Assert.assertEquals(20L, extractor.extractTimestamp(20L, 0L));
        Assert.assertEquals(20L, extractor.extractTimestamp(20L, 0L));
        Assert.assertEquals(20L, extractor.extractTimestamp(20L, 0L));
        Assert.assertEquals(500L, extractor.extractTimestamp(500L, 0L));
        Assert.assertEquals(9223372036854775806L, extractor.extractTimestamp(9223372036854775806L, 99999L));
    }

    private void runInvalidTest(AscendingTimestampExtractor<Long> extractor) {
        Assert.assertEquals(1000L, extractor.extractTimestamp(1000L, 100L));
        Assert.assertEquals(1000L, extractor.extractTimestamp(1000L, 100L));
        Assert.assertEquals(999L, extractor.extractTimestamp(999L, 100L));
    }

    private static class LongExtractor extends AscendingTimestampExtractor<Long> {
        private static final long serialVersionUID = 1L;

        private LongExtractor() {
        }

        public long extractAscendingTimestamp(Long element) {
            return element;
        }
    }
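  • Here withViolationHandler sets the violationHandler to FailingHandler. When the timestamp 999 arrives it is smaller than the previous 1000, so MonotonyViolationHandler.handleViolation is called; with FailingHandler this throws a RuntimeException, which the test catches.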

Summary

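  • For convenience, Flink ships several Pre-defined Timestamp Extractors / Watermark Emitters; AscendingTimestampExtractor is one of them.
  • The abstract class AscendingTimestampExtractor implements the extractTimestamp and getCurrentWatermark methods of the AssignerWithPeriodicWatermarks interface, and declares the abstract method extractAscendingTimestamp for subclasses to implement.
  • AscendingTimestampExtractor is intended for streams whose element timestamps are monotonically ascending within each parallel task. When timestamp monotony is violated, it calls the handleViolation method of MonotonyViolationHandler. MonotonyViolationHandler extends Serializable and defines handleViolation; its three built-in implementations are IgnoringHandler, FailingHandler and LoggingHandler.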
