One SQL to Rule Them All – an Efficient and Syntactically Idiomatic Approach to Management of Streams and Tables
Apache Calcite: Edmon Begoli, Julian Hyde
Apache Beam: Tyler Akidau, Kenneth Knowles
Apache Flink: Fabian Hueske
Kathryn Knight
Real-time data analysis and management are increasingly critical for today’s businesses. SQL is the de facto lingua franca for these endeavors, yet support for robust streaming analysis and management with SQL remains limited. Many approaches restrict semantics to a reduced subset of features and/or require a suite of non-standard constructs. Additionally, use of event timestamps to provide native support for analyzing events according to when they actually occurred is not pervasive, and often comes with important limitations.
We present a three-part proposal for integrating robust streaming into the SQL standard, namely: (1) time-varying relations as a foundation for classical tables as well as streaming data, (2) event time semantics, and (3) a limited set of optional keyword extensions to control the materialization of time-varying query results. Motivated and illustrated using examples and lessons learned from implementations in Apache Calcite, Apache Flink, and Apache Beam, we show how with these minimal additions it is possible to utilize the complete suite of standard SQL semantics to perform robust stream processing.
The thesis of this paper, supported by experience developing large open-source frameworks supporting real-world streaming use cases, is that the SQL language and relational model, as-is and with minor non-intrusive extensions, can be very effective for manipulation of streaming data.
Our motivation is two-fold. First, we want to share our observations, innovations, and lessons learned while working on stream processing in widely used open source frameworks. Second, we want to inform the broader database community of the work we are initiating with the international SQL standardization body [26] to standardize streaming SQL features and extensions, and to facilitate a global dialogue on this topic (we discuss proposed extensions in Section 6).
Combined, tables and streams cover the critical spectrum of business operations ranging from strategic decision making supported by historical data to near- and real-time data used in interactive analysis. SQL has long been a dominant technology for querying and managing tables of data, backed by decades of research and product development.
We believe, based on our experience and nearly two decades of research on streaming SQL extensions, that using the same SQL semantics in a consistent manner is a productive and elegant way to unify these two modalities of data: it simplifies learning, streamlines adoption, and supports development of cohesive data management systems. Our approach is therefore to present a unified way to manage both tables and streams of data using the same semantics.
Building upon the insights gained in prior art and through our own work, we propose these contributions in this paper:
Time-varying relations: First, we propose time-varying relations as a common foundation for SQL, underlying classic point-in-time queries, continuously updated views, and novel streaming queries. A time-varying relation is just what it says: a relation that changes over time, which can also be treated as a function, mapping each point in time to a static relation. Critically, the full suite of existing SQL operators remain valid on time-varying relations (by the natural point-wise application), providing maximal functionality with minimal cognitive overhead. Section 3.1 explores this in detail and Section 6.2 discusses its relationship to standard SQL.
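The pointwise view can be made concrete: a TVR is modeled as a function from a point in time to an instantaneous relation, and any classic relational operator lifts to TVRs by applying it at each point in time. A minimal Python sketch (the `bids` relation and its rows are hypothetical, invented for illustration):

```python
# A time-varying relation (TVR) modeled as a function from a point in time
# to an instantaneous relation, represented here as a set of row tuples.
def tvr_select(tvr, predicate):
    """Lift classic relational selection to a TVR by pointwise application."""
    return lambda t: {row for row in tvr(t) if predicate(row)}

# Hypothetical TVR of auction bids; rows are (itemid, price).
def bids(t):
    snapshots = {0: set(), 1: {("A", 10)}, 2: {("A", 10), ("B", 7)}}
    return snapshots.get(t, snapshots[2])

# The derived TVR evolves in lock step with its input: querying it at time t
# is equivalent to querying the input's snapshot at exactly time t.
high_bids = tvr_select(bids, lambda row: row[1] >= 10)
print(high_bids(0))  # set()
print(high_bids(2))  # {('A', 10)}
```

Because the lifted operator is just ordinary relational selection applied snapshot by snapshot, the full existing suite of SQL operators carries over unchanged.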
Event time semantics: Second, we present a concise proposal for enabling robust event time streaming semantics. The extensions we propose preserve all existing SQL semantics and fit in well. By virtue of utilizing time-varying relations as the underlying primitive concept, we can freely combine classical SQL and event time extensions. Section 3.2 describes the necessary foundations and Sections 6.2-6.4 describe our proposed extensions for supporting event time.
Materialization control: Third, we propose a modest set of materialization controls to provide the necessary flexibility for handling the breadth of modern streaming use cases.
Taken together, we believe these contributions provide a solid foundation for utilizing the full breadth of standard SQL in a streaming context, while providing additional capabilities for robust event time handling in the context of classic point-in-time queries.
Stream processing and streaming SQL, in their direct forms, as well as under the guise of complex event processing (CEP) [18] and continuous querying [11], have been active areas of database research since the 1990s. There have been significant developments in these fields, and we present here a brief survey, by no means exhaustive, of research, industrial, and open source developments relevant to our approach.
Work on stream processing goes back to the introduction of the Tapestry [25] system in 1992, intended for content-based filtering of emails and message board documents using a subset of SQL called TQL [37]. Several years later, Liu et al. introduced OpenCQ, an information delivery system driven by user- or application-specified events, the updates of which only occur at specified triggers that don’t require active monitoring or interference [30]. That same group developed CONQUER, an update/capture system for efficiently monitoring continuous queries over the web, using a three-tier architecture designed to share information among variously structured data sources [29]. Shortly thereafter NiagaraCQ emerged, an XML-QL-based query system designed to address scalability issues of continuous queries by grouping similar continuous queries together via dynamic regrouping [21]. OpenCQ, CONQUER, and NiagaraCQ each support arrival and timer-based queries over a large network (i.e., the Internet). However, neither Tapestry nor OpenCQ addresses multiple query optimization, and NiagaraCQ ignores query execution timings and doesn’t specify time intervals [27].
In 2003, Arasu, Babu, and Widom introduced the Continuous Query Language (CQL), a declarative language similar to SQL, developed by the STREAM project team at Stanford to process both streaming and static data. This work was the first to introduce an exact semantics for general-purpose, declarative continuous queries over streams and relations. It formalized a form of streams, updateable relations, and their relationships; moreover, it defined abstract semantics for continuous queries constructed on top of relational query language concepts [8, 9].
CQL defines three classes of operators: relation-to-relation, stream-to-relation, and relation-to-stream. The core operators, relation-to-relation, use a notation similar to SQL. Stream-to-relation operators extract relations from streams using windowing specifications, such as sliding and tumbling windows. Relation-to-stream operators include the Istream (insert stream), Dstream (delete stream), and Rstream (relation stream) operators [7]. Specifically, these three special operators are defined as follows:
(1) Istream(R) contains all (r, T) where r is an element of R at time T but not at time T − 1
(2) Dstream(R) contains all (r, T) where r is an element of R at time T − 1 but not at time T
(3) Rstream(R) contains all (r, T) where r is an element of R at time T [39]
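These definitions can be sketched directly over a TVR encoded as a sequence of instantaneous relations, with time as the list index (a minimal Python illustration; the toy relation is invented):

```python
def istream(tvr):
    """Istream(R): all (r, T) where r is in R at time T but not at T - 1."""
    out, prev = set(), set()
    for t, rel in enumerate(tvr):
        out |= {(r, t) for r in rel - prev}
        prev = rel
    return out

def dstream(tvr):
    """Dstream(R): all (r, T) where r is in R at time T - 1 but not at T."""
    out, prev = set(), set()
    for t, rel in enumerate(tvr):
        out |= {(r, t) for r in prev - rel}
        prev = rel
    return out

def rstream(tvr):
    """Rstream(R): all (r, T) where r is in R at time T."""
    return {(r, t) for t, rel in enumerate(tvr) for r in rel}

# Toy TVR: "a" is inserted at time 0, "b" at time 1, "a" is deleted at time 2.
tvr = [{"a"}, {"a", "b"}, {"b"}]
# istream(tvr) == {("a", 0), ("b", 1)}   -- the insertions
# dstream(tvr) == {("a", 2)}             -- the deletions
```

Note how Istream and Dstream together form exactly the INSERT/DELETE changelog of the relation, while Rstream re-emits every snapshot in full.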
The kernel of many ideas lies within these operators. Notably, time is implicit. The STREAM system accommodates out-of-order data by buffering it on intake and presenting it to the query processor in timestamp order, so the CQL language does not address querying of out-of-order data.
An important limitation of CQL is that time refers to a logical clock that tracks the evolution of relations and streams, not time as expressed in the data being analyzed, which means time is not a first-class entity one can observe and manipulate alongside other data.
The Aurora system was designed around the same time as STREAM to combine archival, spanning, and real-time monitoring applications into one framework. As in STREAM, queries are structured as directed acyclic graphs with operator vertices and data-flow edges [3]. Aurora was used as the query processing engine for Medusa, a load management system for distributed stream processing systems [13], and Borealis, a stream processing engine developed by Brandeis, Brown, and MIT. Borealis uses Medusa’s load management system and introduced new means to explore fault-tolerance techniques (results revision and query modification) and dynamic load distribution [2]. The optimization processes of these systems still do not take event specification into account. Aurora’s GUI provides custom operators, designed to handle delayed or missing data, with four specific novel functions: timeout capability, out-of-order input handling, user-defined extendibility, and a resampling operator [34]. These operators are partially based on linear algebra/SQL, but also borrow from AQuery and SEQ [12].
IBM introduced SPADE, also known as System S [24], in 2008; this later evolved into InfoSphere Streams, a stream analysis platform which uses SPL, its own native processing language, which allows for event-time annotation.
While streaming SQL has been an area of active research for almost three decades, stream processing itself has enjoyed recent industry attention, and many current streaming systems have adopted some form of SQL functionality.
Apache Spark: Spark’s Dataset API is a high-level declarative API built on top of Spark SQL’s optimizer and execution engine. Dataset programs can be executed on finite data or on streaming data. The streaming variant of the Dataset API is called Structured Streaming [10]. Structured Streaming queries are incrementally evaluated and by default processed using a micro-batch execution engine, which processes data streams as a series of small batch jobs and features exactly-once fault-tolerance guarantees.
KSQL: Confluent’s KSQL [28] is built on top of Kafka Streams, the stream processing framework of the Apache Kafka project. KSQL is a declarative wrapper around Kafka Streams and defines a custom SQL-like syntax to expose the idea of streams and tables [33]. KSQL focuses on eventually consistent, materialized view semantics.
Apache Flink [23] features two relational APIs: the LINQ-style [31] Table API and SQL, the latter of which has been adopted by enterprises such as Alibaba, Huawei, Lyft, and Uber. Queries in both APIs are translated into a common logical plan representation, optimized using Apache Calcite [19], and then executed as batch or streaming applications.
Apache Beam [15] has recently added SQL support, developed with a careful eye towards Beam’s unification of bounded and unbounded data processing [5]. Beam currently implements a subset of the semantics proposed by this paper, and many of the proposed extensions have been informed by our experiences with Beam over the years.
Apache Calcite [19] is widely used as a streaming SQL parser and planner/optimizer, notably in Flink SQL and Beam SQL. In addition to SQL parsing, planning and optimization, Apache Calcite supports stream processing semantics which have, along with the approaches from Flink and Beam, influenced the work presented in this paper.
There are many other such systems that have added some degree of SQL or SQL-like functionality. A key difference in our proposal is that other systems are either limited to a subset of standard SQL or bound to specialized operators. Additionally, the other prominent implementations do not fully support robust event time semantics, which are foundational to our proposal.
In this paper, we synthesize the lessons learned from work on three of these systems (Flink, Beam, and Calcite) into a new proposal for extending the SQL standard with the most essential aspects of streaming relational processing.
Our proposal for streaming SQL comes in two parts. The first, in this section, is conceptual groundwork, laying out concepts and implementation techniques that support the fundamentals of streaming operations. The second, in Section 6, builds on these foundations, identifies the ways in which standard SQL already supports streaming, and proposes minimal extensions to SQL to provide robust support for the remaining concepts. The intervening sections are dedicated to discussing the foundations through examples and lessons learned from our open source frameworks.
In the context of streaming, the key additional dimension to consider is that of time. When dealing with classic relations, one deals with relations at a single point in time. When dealing with streaming relations, one must deal with relations as they evolve over time. We propose making it explicit that SQL operates over time-varying relations, or TVRs.
A time-varying relation is exactly what the name implies: a relation whose contents may vary over time. The idea is compatible with the mutable database tables with which we are already familiar; to a consumer of such a table, it is already a time-varying relation. But such a consumer is explicitly denied the ability to observe or compute based on how the relation changes over time. A traditional SQL query or view can express a derived time-varying relation that evolves in lock step with its inputs: at every point in time, it is equivalent to querying its inputs at exactly that point in time. But there exist TVRs that cannot be expressed in this way, where time itself is a critical input.
TVRs are not a new idea; they are explored in [8, 9, 33]. An important aspect of TVRs is that they may be encoded or materialized in many ways, notably as a sequence of classic relations (instantaneous relations, in the CQL parlance), or as a sequence of INSERT and DELETE operations. These two encodings are duals of one another, and correspond to the tables and streams well described by Sax et al. [33]. There are other useful encodings based on relation column properties. For example, when an aggregation is invertible, a TVR’s encoding may use aggregation differences rather than entire deletes and additions.
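The duality of these two encodings can be made concrete: a sequence of instantaneous relations converts losslessly to a sequence of INSERT/DELETE operations and back. A Python sketch, under the simplifying assumptions that time is discrete and rows are comparable:

```python
def to_changelog(snapshots):
    """Encode a TVR (a sequence of instantaneous relations) as a changelog
    of (op, row, time) entries -- the stream view of the relation."""
    log, prev = [], set()
    for t, rel in enumerate(snapshots):
        log += [("INSERT", r, t) for r in sorted(rel - prev)]
        log += [("DELETE", r, t) for r in sorted(prev - rel)]
        prev = rel
    return log

def to_snapshots(log, n_times):
    """Replay the changelog to recover the table view at each point in time."""
    rel, out = set(), []
    for t in range(n_times):
        for op, r, ts in log:
            if ts == t:
                (rel.add if op == "INSERT" else rel.discard)(r)
        out.append(set(rel))
    return out

snaps = [{"a"}, {"a", "b"}, {"b"}]
log = to_changelog(snaps)
# log == [("INSERT", "a", 0), ("INSERT", "b", 1), ("DELETE", "a", 2)]
assert to_snapshots(log, len(snaps)) == snaps  # the two encodings are duals
```

An invertible aggregation (e.g. a running SUM) could shrink this further by logging deltas instead of whole-row deletes and inserts, as noted above.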
Our main contribution regarding TVRs is to suggest that neither the CQL nor the Streams and Tables approaches go far enough: rather than defining the duality of streams and tables and then proceeding to treat the two as largely different, we should use that duality to our advantage. The key insight, stated but under-utilized in prior work, is that streams and tables are two representations for one semantic object. This is not to say that the representation itself is not interesting - there are use cases for materializing and operating on the stream of changes itself - but this is again a TVR and can be treated uniformly.
What’s important here is that the core semantic object for relations over time is always the TVR, which by definition supports the entire suite of relational operators, even in scenarios involving streaming data. This is critical, because it means anyone who understands enough SQL to solve a problem in a non-streaming context still has the knowledge required to solve the problem in a streaming context as well.
Our second contribution deals with event time semantics. Many approaches fall short of dealing with the inherent independence of event time and processing time. The simplest failure is to assume data is ordered according to event time. In the presence of mobile applications, distributed systems, or even just sharded archival data, this is not the case. Even if data is in order according to event time, the progression of a logical clock or processing clock is unrelated to the scale of time as the events actually happened – one hour of processing time has no relation to one hour of event time. Event time must be explicitly accounted for to achieve correct results.
The STREAM system includes heartbeats as an optional feature to buffer out-of-order data and feed it to the query processor in timestamp order. This introduces latency to allow for timestamp skew. MillWheel [4] based its processing instead on watermarks, computing directly on the out-of-order data along with metadata about how complete the input was believed to be. This approach was further extended in Google’s Cloud Dataflow [5], which pioneered the out-of-order processing model adopted in both Beam and Flink.
The approach taken in KSQL [33] is also to process the data in arrival order. Its windowing syntax is bound to specific types of event-time windowing implementations provided by the system (rather than allowing arbitrary, declarative construction via SQL). Due to its lack of support for watermarks, it is unsuitable for use cases like notifications where some notion of completeness is required, instead favoring an eventual consistency with a polling approach. We believe a more general approach is necessary to serve the full breadth of streaming use cases.
We propose to support event time semantics via two concepts: explicit event timestamps and watermarks. Together, these allow correct event time calculation, such as grouping into intervals (or windows) of event time, to be effectively expressed and carried out without consuming unbounded resources.
To perform robust stream processing over a time-varying relation, the rows of the relation should be timestamped in event time and processed accordingly, not in arrival order or processing time.
A watermark is a mechanism in stream processing for deterministically or heuristically defining a temporal margin of completeness for a timestamped event stream. Such margins are used to reason about the completeness of input data being fed into temporal aggregations, allowing the outputs of such aggregates to be materialized and resources to be released only when the input data for the aggregation are sufficiently complete. For example, a watermark might be compared against the end time of an auction to determine when all valid bids for said auction have arrived, even in a system where events can arrive highly out of order. Some systems provide configuration to allow sufficient slack time for events to arrive.
More formally, a watermark is a monotonic function from processing time to event time. For each moment in processing time, the watermark specifies the event timestamp up to which the input is believed to be complete at that point in processing time. In other words, if a watermark observed at processing time y has value of event time x, it is an assertion that as of processing time y, all future records will have event timestamps greater than x.
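To make the definition concrete, the following sketch (a hypothetical illustration, not any particular system's API) buffers out-of-order timestamped records and releases those at or below the watermark; monotonicity of the watermark guarantees the released prefix of event time is complete:

```python
import heapq

class WatermarkedBuffer:
    """Hold out-of-order (event_time, value) records; release a record only
    once the watermark asserts its portion of event time is complete."""
    def __init__(self):
        self.heap = []                  # min-heap ordered by event time
        self.watermark = float("-inf")

    def add(self, event_time, value):
        # By the watermark contract, event_time > self.watermark should hold;
        # records violating it are late data and need separate handling.
        heapq.heappush(self.heap, (event_time, value))

    def advance(self, wm):
        """Advance the watermark to event time wm: an assertion that all
        future records will carry event timestamps greater than wm."""
        assert wm >= self.watermark, "watermarks are monotonic"
        self.watermark = wm
        released = []
        while self.heap and self.heap[0][0] <= wm:
            released.append(heapq.heappop(self.heap))
        return released  # in event-time order, complete up to wm

buf = WatermarkedBuffer()
buf.add(5, "bid-a")
buf.add(2, "bid-b")        # arrives out of order
buf.add(12, "bid-c")
print(buf.advance(10))     # [(2, 'bid-b'), (5, 'bid-a')]
```

In the auction example above, `advance(auction_end)` returning a batch is exactly the signal that all valid bids for the auction have arrived.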
Our third contribution deals with shaping the way relations are materialized, providing control over how the relation is rendered and when rows themselves are materialized.
As described in [33], stream changelogs are a space-efficient way of describing the evolution of a TVR over time. Changelogs capture the element-by-element differences between two versions of a relation, in effect encoding the sequence of INSERT and DELETE statements used to mutate the relation over time. They also expose metadata about the evolution of the rows in the relation over time. For example: which rows are added or retracted, the processing time at which a row was materialized, and the revision index of a row for a given event-time interval.
If dealing exclusively in TVRs, as recommended above, rendering a changelog stream of a TVR is primarily needed when materializing a stream-oriented view of that TVR for storage, transmission, or introspection (in particular, for inspecting metadata about the stream such as whether a change was additive or retractive). Unlike other approaches which treat stream changelogs as wholly different objects from relations (and the primary construct for dealing with relations over time), we propose representing the changelog as simply another time-varying relation. In that way, it can be operated on using the same machinery as a normal relation. Furthermore, it remains possible to declaratively convert the changelog stream view back into the original TVR using standard SQL (no special operators needed), while also supporting the materialization delays described next.
By modeling input tables and streams as time-varying relations, and the output of a query as a resulting time-varying relation, it may seem natural to define a query’s output as instantaneously changing to reflect any new input. But as an implementation strategy, this is woefully inefficient, producing an enormous volume of irrelevant updates for consumers that are only interested in final results. Even if a consumer is prepared for speculative non-final results, there is likely a maximum frequency at which updates are useful. For example, for a real-time dashboard viewed by a human operator, updates on the order of once per second are probably sufficient. For top-level queries that are stored or transmitted for external consumption, how frequently and why output materialization occurs is fundamental business logic.
There are undoubtedly many interesting ways to specify when materialization is desired. In Section 6.5 we make a concrete proposal based on experience with real-world use cases. But what is important is that the user has some way to express their requirements.
To illustrate the concepts in Section 3, this section examines a concrete example query from the streaming SQL literature. We show how the concepts are used in the query and then walk through its semantics on realistic input.
The following example is from the NEXMark benchmark[38] which was designed to measure the performance of stream query systems. The NEXMark benchmark extends the XMark benchmark [35] and models an online auction platform where users can start auctions for items and bid on items. The NEXMark data model consists of three streams, Person, Auction, and Bid, and a static Category table that holds details about items.
From the NEXMark benchmark we chose Query 7, defined as: "Query 7 monitors the highest price items currently on auction. Every ten minutes, this query returns the highest bid (and associated itemid) in the most recent ten minutes." [38]. This is a continuously evaluated query which consumes a stream of bids as input and produces as output a stream of aggregates computed from finite windows of the input.
Before we show a solution based on plain SQL, we present a variant [17] built with CQL [8] to define the semantics of Query 7:
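The CQL listing itself did not survive in this excerpt; the following is a sketch of the kind of CQL formulation meant, following the dialect of [8] (exact syntax illustrative):

```sql
SELECT Rstream(B.price, B.itemid)
FROM Bid [RANGE 10 MINUTE SLIDE 10 MINUTE] B
WHERE B.price =
  (SELECT MAX(B1.price)
   FROM Bid [RANGE 10 MINUTE SLIDE 10 MINUTE] B1);
```

Here the RANGE/SLIDE window converts the Bid stream into a relation holding the last ten minutes of bids, and Rstream emits the contents of the result relation as a stream.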
Every ten minutes, the query processes the bids of the previous ten minutes. It computes the highest price of the last ten minutes (subquery) and uses that value to select the highest bid of the last ten minutes. The result is appended to a stream. We won’t delve into the details of CQL’s dialect, but we do note some aspects which we will not reproduce in our proposal:
CQL makes explicit the concept of streams and relations, providing operators to convert a stream into a relation (RANGE in our example) and operators to convert a relation into a stream (Rstream in our example). Our approach is based on the single concept of a time-varying relation and does not strictly require conversion operators.
Time is implicit; the grouping into ten-minute windows depends on timestamps that are attached to rows by the underlying stream as metadata. As discussed in Section 3.2, STREAM supports out-of-order timestamps by buffering them and feeding them to CQL in order, so that intervals of event time always correspond to contiguous sections of the stream. Our approach is to process out-of-order data directly, by making event timestamps explicit and leveraging watermarks to reason about input completeness.
Time moves in lockstep for the whole query. There is no explicit condition that the window in the subquery corresponds to the window in the main query. We make this relationship explicit via a join condition.
In contrast, here is Query 7 specified with our proposed extensions to standard SQL.
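The listing was elided in this excerpt; below is a sketch of Query 7 using the proposed extensions. The Tumble table-valued function and the named-argument syntax are part of the proposal discussed later, so details here are illustrative:

```sql
SELECT
  MaxBid.wstart, MaxBid.wend,
  Bid.bidtime, Bid.price, Bid.itemid
FROM
  Bid,
  (SELECT
     MAX(TumbleBid.price) AS maxPrice,
     TumbleBid.wstart AS wstart,
     TumbleBid.wend AS wend
   FROM Tumble(
          data    => TABLE(Bid),
          timecol => DESCRIPTOR(bidtime),
          dur     => INTERVAL '10' MINUTES) TumbleBid
   GROUP BY TumbleBid.wend, TumbleBid.wstart) MaxBid
WHERE
  Bid.price = MaxBid.maxPrice AND
  Bid.bidtime >= MaxBid.wend - INTERVAL '10' MINUTES AND
  Bid.bidtime <  MaxBid.wend;
```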
This query computes the same result, but does so using our proposed extensions to standard SQL (as well as SQL standard features from 2016). Noteworthy points:
The column bidtime holds the time at which a bid occurred. In contrast to the prior query, timestamps are explicit data. Rows in the Bid stream do not arrive in order of bidtime.
The Bid stream is presumed to have a watermark, as described in Section 3.2, estimating completeness of bidtime as a lower bound on future timestamps in the bidtime column. Note that this requirement does not affect the basic semantics of the query: the same query can be evaluated without watermarks over a table that was recorded from the bid stream, yielding the same result.
Tumble is a table-valued function [1] which assigns each row in the bid stream to the 10-minute interval containing bidtime. The output table TumbleBid has all the same columns as Bid plus two additional columns, wstart and wend, which represent the start and end of the tumbling window interval, respectively. The wend column contains timestamps and has an associated watermark that estimates completeness of TumbleBid relative to wend.
The GROUP BY TumbleBid.wend clause is where the watermark is used. Because the watermark provides a lower bound on not-yet-seen values for wend, it allows an implementation to reason about when a particular grouping of inputs is complete. This fact can be used to delay materialization of results until aggregates are known complete, or to provide metadata indicating as much.
As the Bid relation evolves over time, with new events being added, the relation defined by this query also evolves. This is identical to instantaneous view semantics. We have not yet used the advanced features for managing the materialization of this query.
Now let us apply this query to a concrete dataset to illustrate how it might be executed. As we’re interested in streaming data, we care not only about the data involved, but also when the system becomes aware of them (processing time), as well as where in event time they occurred, and the system’s own understanding of input completeness in the event-time domain (i.e., the watermark) over time. The example data set we will use is the following:
Here, the left column of times includes the processing times at which events occur within the system. The right column describes the events themselves, which are either the watermark advancing to a point in event time or a (bidtime, price, item) tuple being inserted into the stream.
The example SQL query in Listing 2 would yield the following results when executed on this dataset at time 8:21 (eliding most of the query body for brevity):
This is effectively the same output that would have been provided by the original CQL query, with the addition of explicit window start, window end, and event occurrence timestamps.
However, this is a table view of the data set, capturing a point-in-time view of the entire relation at query time, not a stream view. If we had executed this query earlier in processing time, say at 8:13, it would have looked very different because only half of the input data had arrived by that time:
In Section 6, we’ll describe how to write a query that creates a stream of output matching that from the original CQL query, and also why our approach is more flexible overall.
Our proposed SQL extensions are informed by prior art and related work, and derived from experience in working on Apache Calcite, Flink, and Beam – open source frameworks with wide adoption across the industry and by other open source frameworks.
In Appendix B, we describe the general architectural properties of these three frameworks, the breadth of their adoption, and streaming implementations that exist today. Though the implementations thus far fall short of the full proposal we make in Section 6, they are a step in the right direction and have yielded useful lessons which have informed its evolution. Here we summarize those lessons:
Some operations only work (efficiently) on watermarked event time attributes. Whether performing an aggregation on behalf of the user or executing overtly stateful business logic, an implementer must have a way to maintain finite state over infinite input. Event time semantics, particularly watermarks, are critical. State for an ongoing aggregation or stateful operator can be freed when the watermark is sufficiently advanced that the state won’t be accessed again.
Operators may erase watermark alignment of event time attributes. Event time processing requires that event timestamps are aligned with watermarks. Since event timestamps are exposed as regular attributes, they can be referenced in arbitrary expressions. Depending on the expression, the result may or may not remain aligned with the watermarks; these cases need to be taken into account during query planning. In some cases it is possible to preserve watermark alignment by adjusting the watermarks, and in others an event time attribute loses its special property.
Time-varying relations might have more than one event time attribute. Most stream processing systems that feature event time processing only support a single event time attribute with watermarks. When joining two TVRs it can happen that the event time attributes of both input TVRs are preserved in the resulting TVR. One approach to address this situation is to "hold-back" the watermark such that all event time attributes remain aligned.
Reasoning about what can be done with an event time attribute can be difficult for users. In order to define a query that can be efficiently executed using event time semantics and reasoning, event time attributes need to be used at specific positions in certain clauses, for instance as an ORDER BY attribute in an OVER clause. These positions are not always easy to spot and failing to use event time attributes correctly easily leads to very expensive execution plans with undesirable semantics.
Reasoning about the size of query state is sometimes a necessary evil. Ideally, users should not need to worry about internals when using SQL. However, when consuming unbounded input, user intervention is useful or sometimes necessary. So we need to consider what metadata the user needs to provide (e.g., the active interval for attribute inserts or updates, such as a sessionId) and also how to give the user feedback about the state being consumed, relating the physical computation back to their query.
It is useful for users to distinguish between streaming and materializing operators. In Flink and Beam, users need to reason explicitly about which operators may produce updating results, which operators can consume updating results, and the effect of operators on event time attributes. These low-level considerations are inappropriate for SQL and have no natural place in relational semantics; we need materialization control extensions that work well with SQL.
Torrents of updates: For a high-throughput stream, it is very expensive to issue updates continually for all derived values. Through materialization controls in Flink and Beam, this can be limited to fewer and more relevant updates.
Work presented here is part of an initial effort to standardize streaming SQL and define our emerging position on its features. In this section, we will first briefly discuss some ways in which SQL already supports streaming, after which we will present our proposed streaming extensions.
SQL as it exists today already includes support for a number of streaming-related approaches. Though not sufficient to cover all relevant streaming use cases, they provide a good foundation upon which to build, notably:
Queries are on table snapshots: As a classical SQL table evolves, queries can execute on their current contents. In this way, SQL already plays nicely with relations over time, albeit only in the context of static snapshots.
Materialized Views: Views (semantically) and materialized views (physically) map a query pointwise over a TVR. At any moment, the view is the result of a query applied to its inputs at that moment. This is an extremely useful initial step in stream processing.
Temporal tables: Temporal tables embody the idea of a time-varying relation, and provide the ability to query snapshots of the table from arbitrary points of time in the past via AS OF SYSTEM TIME operators.
MATCH RECOGNIZE: The MATCH_RECOGNIZE clause was added with SQL:2016 [1]. When combined with event time semantics, this extension is highly relevant to streaming SQL as it enables a new class of stream processing use case, namely complex event processing and pattern matching [18].
There are no extensions necessary to support time-varying relations. Relational operators as they exist today already map one time-varying relation to another naturally. To enable event time semantics in SQL, a relation may include in its schema columns that contain event timestamps. Query execution requires knowledge of which column(s) correspond to event timestamps in order to associate them with watermarks, described below. The metadata that a column contains event timestamps is to be stored as part of or alongside the schema. The timestamps themselves are used in a query like any other data, in contrast to CQL, where timestamps are metadata, and KSQL, which implicitly references event time attributes declared with the schema.
To support unbounded use cases, watermarks are also available as semantic inputs to standard SQL operators. This expands the universe of relational operators to include operators that are not pointwise with respect to time, as in Section 6.5.2. For example, rows may be added to an output relation based only on the advancement of the watermark, even when no rows have changed in the input relation(s).
Extension 1 (Watermarked event time column). An event time column in a relation is a distinguished column of type TIMESTAMP with an associated watermark. The watermark associated with an event time column is maintained by the system as time-varying metadata for the relation as a whole and provides a lower bound on event timestamps that may be added to the column.
When processing over an unbounded stream, an aggregate projected in a query of the form SELECT ... GROUP BY ... is complete when it is known that no more rows will contribute to the aggregate. Without extensions, it is never known whether there may be more inputs that contribute to a grouping. Under event time semantics, the watermark gives a measure of completeness and can determine when a grouping is complete based on event time columns. This corresponds to the now-widespread notion of event-time windowing. We can adapt this to SQL by leveraging event time columns and watermarks.
Extension 2 (Grouping on event timestamps). When a GROUP BY clause contains a grouping key that is an event time column, any grouping where the key is less than the watermark for that column is declared complete, and further inputs that would contribute to that group are dropped. (In practice, a configurable amount of allowed lateness is often needed, but such a mechanism is beyond the scope of this paper; for more details see Chapter 2 of [6].) Every GROUP BY clause with an unbounded input is required to include at least one event time column as a grouping key.
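As a minimal illustration of Extension 2 (assuming bidtime is a watermarked event time column of Bid), grouping directly on the event time column lets an engine declare each group complete, and free its state, once the watermark passes the group's key:

```sql
-- Each bidtime group is declared complete when the watermark
-- passes that timestamp; later inputs for it are dropped.
SELECT bidtime, COUNT(*) AS simultaneous_bids
FROM Bid
GROUP BY bidtime;
```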
It is rare to group by an event time column that contains original event timestamps unless you are trying to find simultaneous events. Instead, event timestamps are usually mapped to a distinguished end time after which the grouping is completed. In the example from Section 4, bid timestamps are mapped to the end of the ten-minute interval that contains them. We propose adding built-in table-valued functions that augment a relation with additional event timestamp columns for these common use cases (while leaving the door open for additional built-in or custom TVFs in the future).
Extension 3 (Event-time windowing functions). Add (as a starting point) built-in table-valued functions Tumble and Hop which take a relation and an event time column descriptor as input and return a relation with additional event-time interval columns as output, and establish a convention for the event-time interval column names.
The invocation and semantics of Tumble and Hop are below. There are other useful event time windowing functions used in streaming applications which the SQL standard may consider adding, but these two are extremely common and illustrative. For brevity, we show abbreviated function signatures and describe the parameters in prose, then illustrate with example invocations.
Tumbling (or "fixed") windows partition event time into equally spaced disjoint covering intervals. Tumble takes three required parameters and one optional parameter:
Tumble(data, timecol, dur, [offset])
The return value of Tumble is a relation that includes all columns of data as well as additional event time columns wstart and wend. Here is an example invocation on the Bid table from the example in Section 4:
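A sketch of such an invocation (the named-argument syntax is part of the proposal and therefore illustrative):

```sql
SELECT *
FROM Tumble(
       data    => TABLE(Bid),
       timecol => DESCRIPTOR(bidtime),
       dur     => INTERVAL '10' MINUTES);
```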
Users can group by wstart or wend; both result in the same groupings and, assuming ideal watermark propagation, the groupings reach completeness at the same time. For example, grouping by wend:
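A sketch of such a grouping (since wstart is functionally determined by wend for tumbling windows, it can be carried through the aggregation with MAX):

```sql
SELECT MAX(wstart) AS wstart, wend, MAX(price) AS maxPrice
FROM Tumble(
       data    => TABLE(Bid),
       timecol => DESCRIPTOR(bidtime),
       dur     => INTERVAL '10' MINUTES)
GROUP BY wend;
```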
Hopping (or "sliding") event time windows place intervals of a fixed size evenly spaced across event time. Hop takes four required parameters and one optional parameter. All parameters are analogous to those for Tumble except for hopsize, which specifies the duration between the starting points (and endpoints) of the hopping windows, allowing for overlapping windows (hopsize < dur, common) or gaps in the data (hopsize > dur, rarely useful).
The return value of Hop is a relation that includes all columns of data as well as additional event time columns wstart and wend. Here is an example invocation on the Bid table from the example in Section 4:
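A sketch of the signature and an example invocation (syntax again illustrative):

```sql
-- Hop(data, timecol, dur, hopsize, [offset])
SELECT *
FROM Hop(
       data    => TABLE(Bid),
       timecol => DESCRIPTOR(bidtime),
       dur     => INTERVAL '10' MINUTES,
       hopsize => INTERVAL '5' MINUTES);
```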
Users can group by wstart or wend with the same effect, as with tumbling windows. For example:
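A sketch of such a grouping (each input row contributes to dur/hopsize windows, so the grouping keys identify each window by its bounds):

```sql
SELECT wstart, wend, MAX(price) AS maxPrice
FROM Hop(
       data    => TABLE(Bid),
       timecol => DESCRIPTOR(bidtime),
       dur     => INTERVAL '10' MINUTES,
       hopsize => INTERVAL '5' MINUTES)
GROUP BY wstart, wend;
```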
Using table-valued functions improves on the current state of implementations in the following ways:
GROUP BY is truly a grouping of rows according to a column’s value. In Calcite, Beam, and Flink, GROUP BY HOP(...) violates relational semantics by causing a single input row to produce multiple rows.
A more uniform notation for all window functions. The near-trivial Tumble has the same general form as the input-expanding Hop, and using table-valued functions allows adding a wide variety of more complex functionality (such as calendar windows or sessionization) with a similar look and feel.
Engines have flexibility in how they implement these table-valued functions. Rows in the output may appear and disappear as appropriate according to downstream materialization requirements.
The last piece of our proposal centers around materialization controls, allowing users flexibility in shaping how and when the rows in their TVRs are materialized over time.
The how aspect of materialization centers around the choice of materializing a TVR as a table or a stream. The long-standing default for relations has been to materialize them as tables. And since this approach is completely compatible with the idea of swapping point-in-time relations for time-varying relations, no changes around materializing a table are necessary. However, in certain situations, materializing a stream-oriented changelog view of the TVR is desirable.
In these cases, we require some way to signal to the system that the changelog of the relation should be materialized. We propose the use of a new EMIT STREAM modifier on a query to do this. Recall our original motivating query results from Listing 3, which rendered a table view of our example query. By adding EMIT STREAM at the top level, we materialize the stream changelog for the TVR instead of a point-in-time snapshot of the relation itself:
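Schematically, the modifier attaches to the end of the query; the body is elided here just as in the excerpted listings:

```sql
SELECT
  MaxBid.wstart, MaxBid.wend,
  Bid.bidtime, Bid.price, Bid.itemid
FROM ...          -- body as in the Query 7 listing
EMIT STREAM;
```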
Note that there are a number of additional columns included in the STREAM version:
A changelog only has multiple revisions to a row when there is an aggregation present in the query, resulting in changes to the row over time.
Extension 4 (Stream Materialization). EMIT STREAM results in a time-varying relation representing changes to the classical result of the query. In addition to the schema of the classical result, the change stream includes columns indicating: whether the row is a retraction of a previous row, the changelog processing-time offset of the row, and a sequence number relative to other changes to the same event-time grouping.
One could imagine other options, such as allowing materialization of deltas rather than aggregates, or even entire relations à la CQL’s Rstream. These could be specified with additional modifiers, but are beyond the scope of this paper.
As far as matching the output of the original CQL query goes, the STREAM keyword is a step in the right direction, but it is clearly more verbose: it captures the full evolution of the highest bidders for the given 10-minute event time windows as data arrive, whereas the CQL version provided only a single answer per 10-minute window once the input data for that window was complete. To tune the stream output to match the behavior of CQL (while accommodating out-of-order input data), we need to support materialization delay.
The when aspect of materialization centers around the way relations evolve over time. The standard approach is on a record-by-record basis: as DML operations such as INSERT and DELETE are applied to a relation, those changes are immediately reflected. However, when dealing with aggregate changes in relations, it’s often beneficial to delay the materialization of an aggregate in some way. Over the years, we’ve observed two main categories of delayed materialization in common use: completeness delays and periodic delays.
Completeness delays: Event time windowing provides a means for slicing an unbounded relation into finite temporal chunks, and for use cases where eventual consistency of windowed aggregates is sufficient, no further extensions are required. However, some use cases dictate that aggregates only be materialized when their inputs are complete, such as queries for which partial results are too unstable to be of any use: for example, a query which determines whether a numeric sum is even or odd. These still benefit from watermark-driven materialization even when consumed as a table.
Recall again our query from Listing 4 where we queried the table version of our relation at 8:13. That query presented a partial result for each window, capturing the highest priced items for each tumbling window at that point in processing time. For use cases where presenting such partial results is undesirable, we propose the syntax EMIT AFTER WATERMARK to ensure the table view would only materialize rows whose input data were complete. In that way, our query at 8:13 would return an empty table:
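Schematically, the query carries the modifier at its end (body elided, as in the excerpted listings); at 8:13 its table materialization would contain no rows:

```sql
SELECT ...
FROM ...
EMIT AFTER WATERMARK;
```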
If we were to query again at 8:16, once the watermark had passed the end of the first window, we’d see the final result for the first window, but still none for the second:
And then if we queried again at 8:21, after the watermark had passed the end of the second window, we would finally have the final answers for both windows:
We can also use STREAM materialization to concisely observe the evolution of the result, which is analogous to what the original CQL query would produce:
Comparing this to the evolution of the streamed changelog in Section 6.5.1 illustrates the difference with AFTER WATERMARK:
The most common example of delayed stream materialization is notification use cases, where polling the contents of an eventually consistent relation is infeasible. In this case, it’s more useful to consume the relation as a stream which contains only aggregates whose input data is known to be complete. This is the type of use case targeted by the original CQL top bids query.
Extension 5 (Materialization Delay: Completeness). When a query has an EMIT AFTER WATERMARK modifier, only complete rows from the results are materialized.
Periodic delays: The second delayed materialization use case we care about revolves around managing the verbosity of an eventually consistent STREAM changelog. The default STREAM rendering, as we saw above, provides updates every time any row in the relation changes. For high volume streams, such a changelog can be quite verbose. In those cases, it is often desirable to limit how frequently aggregates in the relation are updated. To do so, we propose the addition of an AFTER DELAY modifier to the EMIT clause, which dictates a delay imposed on materialization after a change to a given aggregate occurs, for example:
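A sketch of such a clause (query body elided; a six-minute delay is used here, matching the discussion that follows):

```sql
SELECT ...
FROM ...
EMIT STREAM AFTER DELAY INTERVAL '6' MINUTES;
```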
In this example, multiple updates for each of the windows are compressed together, each within a six-minute delay from the first change to the row.
Extension 6 (Periodic Materialization). When a query has EMIT AFTER DELAY d, rows are materialized with period d (instead of continuously).
It’s also possible to combine AFTER DELAY modifiers with AFTER WATERMARK modifiers to provide the early/on-time/late pattern [6] of repeated periodic updates for partial result rows, followed by a single on-time row, followed by repeated periodic updates for any late rows.
Extension 7 (Combined Materialization Delay). When a query has EMIT AFTER DELAY d AND AFTER WATERMARK, rows are materialized with period d as well as when complete.
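Combined, the clause reads as follows (query body elided):

```sql
SELECT ...
FROM ...
EMIT STREAM AFTER DELAY INTERVAL '6' MINUTES AND AFTER WATERMARK;
```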
Streaming SQL is an exercise in manipulating relations over time. The large body of streaming SQL literature combined with recent efforts in the modern streaming community form a strong foundation for basic streaming semantics, but room for improvement remains in the dimensions of usability, flexibility, and robust event-time processing.We believe that the three contributions proposed in this paper, (1) pervasive use of time-varying relations, (2) robust event-time semantics support, and (3) materialization control can substantially improve the ease-of-use of streaming SQL. Moreover, they will broaden the menu of available operators to not only include the full suite of point-in-time relational operators available in standard SQL today, but also extend the capabilities of the language to operators that function over time to shape when and how relations evolve.
Expanded/custom event-time windowing: Although the windowing TVFs proposed in Section 6.4 are common, Beam and Flink both provide many more, e.g., transitive closure sessions (periods of contiguous activity), keyed sessions (periods with a common session identifier, with timeout), and calendar-based windows. Experience has also shown that prebuilt solutions are never sufficient for all use cases (Chapter 4 of [6]); ultimately, users should be able to utilize the power of SQL to describe their own custom-windowing TVFs.
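As an illustration, a keyed-session TVF could plausibly follow the shape of the existing windowing TVFs; the Session function and its parameters below are hypothetical, not part of the proposal:

```sql
-- Hypothetical keyed-session TVF: groups each bidder's bids into
-- sessions that close after 5 minutes of inactivity.
SELECT bidder, wend, SUM(price)
FROM Session(Bid, DESCRIPTOR(bidtime), DESCRIPTOR(bidder), INTERVAL '5' MINUTES)
GROUP BY bidder, wend;
```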
Time-progressing expressions: Computing a view over the tail of a stream is common, for example counting the bids of the last hour. Conceptually, this can be done with a predicate such as bidtime > CURRENT_TIME - INTERVAL '1' HOUR. However, the SQL standard specifies that expressions like CURRENT_TIME are fixed at query execution time. Hence, we need expressions whose value progresses over time.
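To make the limitation concrete, the intended "last hour" view would be written as below; under the standard's semantics CURRENT_TIME is evaluated once at query start, so the window would not slide forward as the stream progresses (the Bid schema is carried over from earlier examples):

```sql
-- Conceptual only: with standard semantics this filters relative to a
-- single fixed timestamp rather than a continuously advancing one.
SELECT bidder, COUNT(*) AS bids_last_hour
FROM Bid
WHERE bidtime > CURRENT_TIME - INTERVAL '1' HOUR
GROUP BY bidder;
```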
Correlated access to temporal tables: A common use case in streaming SQL is to enrich a table with attributes from a temporal table at a specific point in time, such as enriching an order with the currency exchange rate in effect at the time the order was placed. Currently, only a temporal version specified by a fixed literal AS OF SYSTEM TIME can be accessed. To enable temporal tables in joins, the table version needs to be accessible via a correlated join attribute.
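One possible shape for such a join, loosely following Flink's FOR SYSTEM_TIME AS OF syntax; the Orders and Rates tables and their columns are illustrative assumptions:

```sql
-- Each order is joined with the version of Rates that was valid at the
-- order's own event time, not at a fixed literal timestamp.
SELECT o.orderid, o.price * r.rate AS price_in_euro
FROM Orders AS o
JOIN Rates FOR SYSTEM_TIME AS OF o.ordertime AS r
  ON o.currency = r.currency;
```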
Streaming changelog options: As alluded to in Section 6.5.1, more options for stream materialization exist, and EMIT should probably be extended to support them, in particular rendering a stream changelog as a sequence of deltas.
Nested EMIT: Though we propose limiting the application of EMIT to the top level of a query, an argument can be made for the utility of allowing EMIT at any level of nested query. It is worthwhile to explore the tension between additional power and additional complexity this change would impose.
Graceful evolution: Streaming queries by definition exist over an extended period of time, but software is never done nor perfect: bugs are uncovered, requirements evolve, and over time, long-running queries must change. The stateful nature of these queries imposes new challenges regarding the evolution of intermediate state. This remains an unsolved problem for the streaming community in general, but its relevance in the more abstract realm of SQL is all the greater.
More rigorous formal definitions of semantics: Although we’ve tried to provide semi-formal analyses of concepts presented in this paper where applicable, we as a streaming community still lack a true formal analysis of what streaming means, particularly when applied to some of the more subtle aspects of event-time processing such as watermarks and materialization controls. A more rigorous survey of modern streaming concepts would be a welcome and beneficial addition to the literature.