Dataflow Programming Model in Flink

[TOC]

Levels of Abstraction

Flink offers different levels of abstraction for developing streaming/batch applications.

[Figure: levels of abstraction]

  • The lowest level abstraction simply offers stateful streaming. It is embedded into the DataStream API via the Process Function. It allows users to freely process events from one or more streams, and to use consistent fault-tolerant state. In addition, users can register event time and processing time callbacks, allowing programs to realize sophisticated computations.
  • In practice, most applications do not need the low-level abstraction described above, but instead program against the Core APIs: the DataStream API (bounded/unbounded streams) and the DataSet API (bounded data sets). These fluent APIs offer the common building blocks for data processing, like various forms of user-specified transformations, joins, aggregations, windows, state, etc. Data types processed in these APIs are represented as classes in the respective programming languages.
  • The Table API is a declarative DSL centered around tables, which may be dynamically changing tables (when representing streams). The Table API follows the (extended) relational model: tables have a schema attached (similar to tables in relational databases) and the API offers comparable operations, such as select, project, join, group-by, aggregate, etc. Table API programs declaratively define what logical operation should be done, rather than specifying exactly how the code for the operation looks. Though the Table API is extensible by various types of user-defined functions, it is less expressive than the Core APIs, but more concise to use (less code to write). In addition, Table API programs go through an optimizer that applies optimization rules before execution.

    One can seamlessly convert between tables and DataStream/DataSet, allowing programs to mix the Table API with the DataStream and DataSet APIs.

  • The highest level abstraction offered by Flink is SQL. This abstraction is similar to the Table API both in semantics and expressiveness, but represents programs as SQL query expressions. The SQL abstraction closely interacts with the Table API, and SQL queries can be executed over tables defined in the Table API.
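
To make the lowest level concrete, here is a minimal sketch of a KeyedProcessFunction in Java, assuming a Flink 1.x dependency; the `Event` POJO, the one-minute timer, and the output message are all made up for illustration:

```java
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

// Hypothetical event type; a real job would use its own class.
class Event {
    public String key;
    public long value;
}

public class OneMinuteTimerFunction extends KeyedProcessFunction<String, Event, String> {

    @Override
    public void processElement(Event event, Context ctx, Collector<String> out) throws Exception {
        // Free-form per-event logic goes here; this sketch just registers a
        // processing-time callback that fires one minute after the event.
        long oneMinuteLater = ctx.timerService().currentProcessingTime() + 60_000;
        ctx.timerService().registerProcessingTimeTimer(oneMinuteLater);
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) throws Exception {
        out.collect("timer fired at " + timestamp + " for key " + ctx.getCurrentKey());
    }
}
```

Such a function is attached after a keyBy(), e.g. `stream.keyBy(e -> e.key).process(new OneMinuteTimerFunction())`, so that its state and timers are scoped per key.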

Programs and Dataflows

The basic building blocks of Flink programs are streams and transformations. (Note that the DataSets used in Flink’s DataSet API are also streams internally – more about that later.) Conceptually a stream is a (potentially never-ending) flow of data records, and a transformation is an operation that takes one or more streams as input, and produces one or more output streams as a result.

When executed, Flink programs are mapped to streaming dataflows, consisting of streams and transformation operators. Each dataflow starts with one or more sources and ends in one or more sinks. The dataflows resemble arbitrary directed acyclic graphs (DAGs). Although special forms of cycles are permitted via iteration constructs, for the most part we will gloss over this for simplicity.

[Figure: program_dataflow]

Often there is a one-to-one correspondence between the transformations in the programs and the operators in the dataflow. Sometimes, however, one transformation may consist of multiple transformation operators.

Sources and sinks are documented in the streaming connectors and batch connectors docs. Transformations are documented in DataStream operators and DataSet transformations.

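To make the mapping concrete, here is a hedged sketch of a program whose dataflow has one source, a chain of transformations, and one sink; the host, port, and window size are arbitrary, and it assumes Flink 1.x APIs such as keyBy(0) and timeWindow:

```java
import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.util.Collector;

public class WindowWordCount {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Source: a stream of text lines read from a socket.
        DataStream<String> lines = env.socketTextStream("localhost", 9999);

        // Transformations: split into words, key by word, count per 5-second window.
        DataStream<Tuple2<String, Integer>> counts = lines
            .flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
                @Override
                public void flatMap(String line, Collector<Tuple2<String, Integer>> out) {
                    for (String word : line.split("\\s+")) {
                        out.collect(Tuple2.of(word, 1));
                    }
                }
            })
            .keyBy(0)
            .timeWindow(Time.seconds(5))
            .sum(1);

        // Sink: print the resulting stream.
        counts.print();

        env.execute("Window WordCount");
    }
}
```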

Parallel Dataflows

Programs in Flink are inherently parallel and distributed. During execution, a stream has one or more stream partitions, and each operator has one or more operator subtasks. The operator subtasks are independent of one another, and execute in different threads and possibly on different machines or containers.

The number of operator subtasks is the parallelism of that particular operator. The parallelism of a stream is always that of its producing operator. Different operators of the same program may have different levels of parallelism.

[Figure: parallel dataflows]

Streams can transport data between two operators in a one-to-one (or forwarding) pattern, or in a redistributing pattern:

  • One-to-one streams (for example between the Source and the map() operators in the figure above) preserve the partitioning and ordering of the elements. That means that subtask[1] of the map() operator will see the same elements in the same order as they were produced by subtask[1] of the Source operator.
  • Redistributing streams (as between map() and keyBy/window above, as well as between keyBy/window and Sink) change the partitioning of streams. Each operator subtask sends data to different target subtasks, depending on the selected transformation. Examples are keyBy() (which re-partitions by hashing the key), broadcast(), or rebalance() (which re-partitions randomly). In a redistributing exchange the ordering among the elements is only preserved within each pair of sending and receiving subtasks (for example, subtask[1] of map() and subtask[2] of keyBy/window). So in this example, the ordering within each key is preserved, but the parallelism does introduce non-determinism regarding the order in which the aggregated results for different keys arrive at the sink.

Details about configuring and controlling parallelism can be found in the docs on parallel execution.

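As a small sketch (again assuming a Flink 1.x setup) of how parallelism and the two exchange patterns appear in code:

```java
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ParallelismExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(4); // default parallelism for operators in this job

        // The socket source is non-parallel (parallelism 1).
        DataStream<String> lines = env.socketTextStream("localhost", 9999);

        // The map operator gets a per-operator override: two subtasks.
        // Source -> map is a one-to-one (forwarding) connection.
        DataStream<String> upperCased = lines
            .map(new MapFunction<String, String>() {
                @Override
                public String map(String value) {
                    return value.toUpperCase();
                }
            })
            .setParallelism(2);

        // rebalance() forces a round-robin redistributing exchange;
        // keyBy() would instead re-partition by hashing the key.
        upperCased.rebalance().print();

        env.execute("Parallelism example");
    }
}
```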

Windows

Aggregating events (e.g., counts, sums) works differently on streams than in batch processing. For example, it is impossible to count all elements in a stream, because streams are in general infinite (unbounded). Instead, aggregates on streams (counts, sums, etc.) are scoped by windows, such as "count over the last 5 minutes", or "sum of the last 100 elements".

Windows can be time driven (example: every 30 seconds) or data driven (example: every 100 elements). One typically distinguishes different types of windows, such as tumbling windows (no overlap), sliding windows (with overlap), and session windows (punctuated by a gap of inactivity).

More window examples can be found in this blog post. More details are in the window docs.

[Figure: windows]

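As a sketch of these window types on a keyed stream of (word, count) pairs; the window sizes, gap, and sample input are arbitrary, and the Flink 1.x windowing API is assumed:

```java
import org.apache.flink.api.java.tuple.Tuple;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.KeyedStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.ProcessingTimeSessionWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class WindowTypesExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        DataStream<Tuple2<String, Integer>> pairs =
            env.fromElements(Tuple2.of("a", 1), Tuple2.of("b", 1), Tuple2.of("a", 1));

        KeyedStream<Tuple2<String, Integer>, Tuple> keyed = pairs.keyBy(0);

        // Tumbling time window (time driven, no overlap).
        keyed.timeWindow(Time.seconds(30)).sum(1).print();

        // Sliding time window (with overlap): 1-minute windows every 30 seconds.
        keyed.timeWindow(Time.minutes(1), Time.seconds(30)).sum(1).print();

        // Tumbling count window (data driven): every 100 elements per key.
        keyed.countWindow(100).sum(1).print();

        // Session window: closed by a 10-minute gap of inactivity.
        keyed.window(ProcessingTimeSessionWindows.withGap(Time.minutes(10))).sum(1).print();

        env.execute("Window types example");
    }
}
```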

Time

When referring to time in a streaming program (for example to define windows), one can refer to different notions of time:

  • Event Time is the time when an event was created. It is usually described by a timestamp in the events, for example attached by the producing sensor, or the producing service. Flink accesses event timestamps via timestamp assigners.
  • Ingestion time is the time when an event enters the Flink dataflow at the source operator.
  • Processing Time is the local time at each operator that performs a time-based operation.

More details on how to handle time are in the event time docs.

[Figure: event time, ingestion time, and processing time]

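A sketch of choosing event time and attaching a timestamp assigner, assuming the Flink 1.x API; SensorReading is a made-up POJO whose timestamp field carries the event's creation time:

```java
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
import org.apache.flink.streaming.api.windowing.time.Time;

public class EventTimeExample {

    // Hypothetical event type carrying its creation time.
    public static class SensorReading {
        public long timestamp;
        public double value;
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Use event time rather than ingestion or processing time.
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

        DataStream<SensorReading> readings = env.fromElements(new SensorReading());

        // Timestamp assigner: extract the event-time timestamp from each record,
        // tolerating events that arrive up to 5 seconds out of order.
        DataStream<SensorReading> withTimestamps = readings.assignTimestampsAndWatermarks(
            new BoundedOutOfOrdernessTimestampExtractor<SensorReading>(Time.seconds(5)) {
                @Override
                public long extractTimestamp(SensorReading reading) {
                    return reading.timestamp;
                }
            });

        withTimestamps.print();
        env.execute("Event time example");
    }
}
```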

Stateful Operations

While many operations in a dataflow simply look at one individual event at a time (for example an event parser), some operations remember information across multiple events (for example window operators). These operations are called stateful.

The state of stateful operations is maintained in what can be thought of as an embedded key/value store. The state is partitioned and distributed strictly together with the streams that are read by the stateful operators. Hence, access to the key/value state is only possible on keyed streams, after a keyBy() function, and is restricted to the values associated with the current event’s key. Aligning the keys of streams and state makes sure that all state updates are local operations, guaranteeing consistency without transaction overhead. This alignment also allows Flink to redistribute the state and adjust the stream partitioning transparently.

For more information, see the documentation on state.

[Figure: state partition]

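As a hedged sketch of this keyed state model (Flink 1.x API; names are illustrative): a RichFlatMapFunction that keeps a per-key count in a ValueState, the embedded key/value store described above:

```java
import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

// Counts events per key; the count lives in fault-tolerant keyed state,
// which is only accessible after a keyBy() on the input stream.
public class CountPerKey extends RichFlatMapFunction<Tuple2<String, Long>, Tuple2<String, Long>> {

    private transient ValueState<Long> count;

    @Override
    public void open(Configuration parameters) {
        count = getRuntimeContext().getState(
            new ValueStateDescriptor<>("count", Long.class));
    }

    @Override
    public void flatMap(Tuple2<String, Long> input, Collector<Tuple2<String, Long>> out) throws Exception {
        // State access is implicitly scoped to the key of the current event.
        Long current = count.value();
        long updated = (current == null ? 0L : current) + 1;
        count.update(updated);
        out.collect(Tuple2.of(input.f0, updated));
    }
}
```

Applied as `input.keyBy(0).flatMap(new CountPerKey())`; the keyBy() is what aligns the stream partitioning with the state.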

Checkpoints for Fault Tolerance

Flink implements fault tolerance using a combination of stream replay and checkpointing. A checkpoint is related to a specific point in each of the input streams along with the corresponding state for each of the operators. A streaming dataflow can be resumed from a checkpoint while maintaining consistency (exactly-once processing semantics) by restoring the state of the operators and replaying the events from the point of the checkpoint.

The checkpoint interval is a means of trading off the overhead of fault tolerance during execution with the recovery time (the number of events that need to be replayed).

The description of the fault tolerance internals provides more information about how Flink manages checkpoints and related topics. Details about enabling and configuring checkpointing are in the checkpointing API docs.

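A minimal sketch, assuming the Flink 1.x API, of enabling checkpointing and trading the interval against recovery time:

```java
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointConfigExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Checkpoint every 10 seconds: a shorter interval means fewer events to
        // replay on recovery, a longer one means less overhead while running.
        env.enableCheckpointing(10_000);

        // Exactly-once processing semantics (the default mode).
        env.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE);

        // Leave at least 500 ms of regular processing between checkpoints.
        env.getCheckpointConfig().setMinPauseBetweenCheckpoints(500);

        env.fromElements(1, 2, 3).print();
        env.execute("Checkpoint configuration example");
    }
}
```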

Batch on Streaming

Flink executes batch programs as a special case of streaming programs, where the streams are bounded (finite number of elements). A DataSet is treated internally as a stream of data. The concepts above thus apply to batch programs in the same way as well as they apply to streaming programs, with minor exceptions:

  • Fault tolerance for batch programs does not use checkpointing. Recovery happens by fully replaying the streams. That is possible, because inputs are bounded. This pushes the cost more towards the recovery, but makes the regular processing cheaper, because it avoids checkpoints.
  • Stateful operations in the DataSet API use simplified in-memory/out-of-core data structures, rather than key/value indexes.
  • The DataSet API introduces special synchronized (superstep-based) iterations, which are only possible on bounded streams. For details, check out the iteration docs.

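As a sketch of the DataSet API side (bounded input, no checkpointing involved; assumes a Flink 1.x batch setup):

```java
import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;

public class BatchWordCount {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // A bounded input: internally still treated as a (finite) stream.
        DataSet<String> text = env.fromElements("to be or not to be");

        DataSet<Tuple2<String, Integer>> counts = text
            .flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
                @Override
                public void flatMap(String line, Collector<Tuple2<String, Integer>> out) {
                    for (String word : line.split("\\s+")) {
                        out.collect(Tuple2.of(word, 1));
                    }
                }
            })
            .groupBy(0)
            .sum(1);

        counts.print(); // print() triggers execution for DataSet programs
    }
}
```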

Next Steps

Continue with the basic concepts of Flink's Distributed Runtime.
