CESSNA: Resilient Edge-Computing

Date: 2019-11-20

CESSNA: 彈性邊緣計算

This post covers a paper from the SIGCOMM 2018 Workshop on Mobile Edge Communications (MECOMM).

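I translated the paper into Chinese. Because time was short and my English is limited, mistakes are inevitable; corrections from readers are welcome.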

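This post and the translation are for study purposes only. If anything here is inappropriate, please contact me and I will remove it.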

The paper has five co-authors: Yotam Harchol, Aisha Mushtaq, and James McCauley of UC Berkeley; Aurojit Panda of NYU and ICSI; and Scott Shenker of UC Berkeley and ICSI.

ABSTRACT （摘要）

The introduction of computational resources at the network edge has moved us from a Client-Server model to a Client-Edge-Server model. By offloading computation from clients and/or servers, this approach can reduce response latency, backbone bandwidth, and computational requirements on clients. While this is an attractive paradigm for many applications, particularly 5G mobile networks and IoT devices, it raises the question of how one can design such a client-edge-server system to tolerate edge failures and client mobility. The key challenge is to ensure correctness when the edge processing is stateful (so the processing depends on state it has previously seen from the client and/or server). In this paper we propose an initial design for meeting this challenge called Client-Edge-Server for Stateful Network Applications (CESSNA).

網絡邊緣處計算資源的引入將咱們由客戶端-服務器模式轉向客戶端-邊緣-服務器模式。經過卸載客戶端和/或服務器的計算，這一方法能夠下降響應延遲、骨幹帶寬和客戶端的計算需求。儘管這對許多應用來講是具備吸引力的結構（特別是5G移動網絡和IoT設備），它提出以下問題：如何設計這種容忍邊緣故障和客戶端移動性的客戶端-邊緣-服務器系統。關鍵挑戰是在邊緣處理是有狀態時（處理依賴於客戶端和/或服務器的前期狀態）保證正確性。本文提出一種知足這一挑戰的初始設計，稱爲爲有狀態網絡應用的客戶端-邊緣-服務器（CESSNA）。

1 INTRODUCTION （引言）

The recent introduction of compute and storage resources at the network edge allows service providers to offer lower latency and higher throughput to geographically nearby content and computation, and this in turn allows applications such as sensors and IoT devices to reduce their upstream bandwidth requirements by pre-processing data at the edge.

最近，網絡邊緣處計算和存儲資源的引入使得服務提供者能夠爲地理位置接近的內容和計算提供更低延遲和更高帶寬，同時這相應地使得應用（如傳感器和IoT設備）能夠經過在邊緣處預處理數據下降上行帶寬需求。

Network applications have long been based on the client-server paradigm, where a stateful server (or set of servers) provides services to multiple clients. While consistency issues of the client-server model have been thoroughly studied, the addition of a stateful edge processor in between the two complicates the consistency problem.

網絡應用長期基於客戶端-服務器模式；這種模式下，一個狀態服務端（或服務端集合）爲多個客戶端提供服務。儘管客戶端-服務器模型的一致性已經被深刻的研究，服務端和客戶端之間的額外的有狀態邊緣處理器使一致性問題複雜化。

To illustrate the problem, consider a simple example of an edge that serves as a packet counter: The server is not interested in getting every packet from the client, but only in the total number of packets the client has emitted. The edge is thus holding the counter value and if the edge fails, even if another edge is brought up immediately, the state is lost and the server never gets an accurate count of the emitted messages.

爲了解釋這個問題，考慮以下簡單示例：邊緣做爲數據包計數服務。服務器不對獲取客戶端的每一個數據包感興趣，只須要獲得客戶端發送的數據包總量。所以，邊緣維持計數值；若是邊緣故障，即便另外一個邊緣當即啓動，狀態也會丟失，服務器沒法獲取發送信息的準確計數。
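The packet-counter example can be made concrete with a few lines of Python. This is only an illustrative sketch: `CountingEdge` and `run_with_failure` are hypothetical names that do not come from the paper, and the failure is modelled simply as replacing the edge object with a fresh one.

```python
class CountingEdge:
    """Hypothetical edge that reports only a running packet count."""

    def __init__(self):
        self.count = 0  # state lives only in this edge's memory

    def on_client_packet(self, packet: bytes) -> None:
        self.count += 1

    def report(self) -> int:
        return self.count


def run_with_failure(packets, fail_after):
    """Process packets, replacing the edge once after `fail_after` packets."""
    edge = CountingEdge()
    for i, packet in enumerate(packets):
        if i == fail_after:
            edge = CountingEdge()  # replacement edge starts from empty state
        edge.on_client_packet(packet)
    return edge.report()


packets = [b"p"] * 10
reported = run_with_failure(packets, fail_after=4)
print(reported)  # the server sees 6, although the client emitted 10 packets
```

Without some recovery mechanism, the four packets counted before the failure are simply gone, which is the inconsistency the rest of the paper sets out to prevent.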

This is merely an illustration of a general problem. In this paper we first formally articulate the edge consistency problem, and then propose a general purpose framework for client-edge-server applications that provides strong consistency guarantees as described later in this paper.

這只是一個廣泛問題的示例。本文中，咱們首先形式化地表示邊緣一致性問題，而後提出一種提供強一致性保證的通用客戶端-邊緣-服務器應用框架（見本文後續章節）。

Our computational model, elaborated in Section 2, is of a computationally-capable edge that allows offloading of computation from the server, the client, or both (and can receive messages from both clients and servers). We assume that the edge is stateful but keeps state on a per-client basis: that is, a new edge process (or set of processes) is instantiated to handle each client-server session. Thus, the consistency we wish to provide for the edge would guarantee that the edge’s state always correctly reflects both client’s and server’s inputs (relative to this session) for as long as this session is active. Moreover, we only care about preserving consistency in the case of an edge failure; if the client or server fails, we assume the session terminates implicitly. However, note that nothing in our design precludes the use of replication or other techniques to increase the resiliency of the server (or even the client).

咱們的計算模型（在第二部分詳細說明）是一種具有計算能力的邊緣，能夠卸載服務器、客戶端或者二者的計算（也容許接收來自客戶端和服務器的信息）。咱們假設邊緣是有狀態的，可是以每客戶端爲基礎保持狀態；即，實例化一個新的邊緣過程（或一組過程）處理每一個客戶端-服務器對話。所以，只要會話是有效的，咱們爲邊緣提供的一致性能夠保證邊緣的狀態老是正確地反映客戶端和服務器的輸入（相對於此對話）。此外，咱們只關心在邊緣故障的情形下維護一致性；若是客戶端或服務器故障，咱們假設會話隱式地結束。然而，注意到咱們的設計並不阻止複製或其它增強服務器（甚至是客戶端）彈性的技術的使用。

Given this model, we would want a framework that will provide client-edge-server applications with these consistency guarantees, even though the edge may arbitrarily fail, and clients may arbitrarily move between edges. Our design aims at no or minimal modification to the source code of existing applications.

給定這種模型，咱們指望一種能夠提供具有一致性保證的客戶端-邊緣-服務器應用的框架，即便邊緣可能任意失效；而且，客戶端可能在邊緣間任意移動。咱們的設計目標是沒有或者不多對現有的應用源代碼的修改。

Our proposed design is built in two layers, for two different types of edge recovery: local recovery, and remote recovery. Local recovery can be used when the failed edge and the recovered edge are physically close to each other, for example, under the same ToR switch. In this case we use a mechanism similar to the one used in [12], though we strip it down to a much-simplified approach, which is enough due to the client-specific nature of edge state that we consider.

咱們的設計包含兩層，用於兩種不一樣類型的邊緣恢復：局部恢復和遠程恢復。局部恢復用於故障邊緣和恢復邊緣在物理上彼此接近時（如，在同一個ToR交換機之下）。在這種情形下，咱們使用一種相似於[12]中的機制，但咱們將其簡化爲一種更爲簡單的方法。鑑於咱們考慮的邊緣狀態的客戶端特定特性，這種簡化的方法是足夠的。

Remote recovery refers to the case when the two edges are far from each other, and can also apply to the client mobility case. In this case, the client and the server cooperate with the newly provisioned edge to quickly restore its state and continue the session from where it has stopped.

遠程恢復指兩個邊緣彼此距離較遠的情形，一樣能夠用於客戶端移動的情形。在這種情形下，客戶端和服務器與新提供的邊緣協做，從而快速恢復狀態，並從上次會話中止處繼續進行。

We make the following observations when coming to design such a framework:

在咱們的框架設計中咱們作出以下觀察：

The edge receives messages from two different parties, and its state may be dependent on the exact ordering of these messages. Thus, for any process of reconstruction of the state, we must have this ordering available.
邊緣從兩個不一樣部分接收消息，其狀態可能依賴於這些消息的精確順序。所以，對於重構狀態的任意進程，咱們必須使得這一順序可用。
In order to allow such a reconstruction of the edge’s state, each endpoint (client, server) should keep a copy of the messages it sent to the edge, at least until the edge can guarantee these copies are not needed anymore.
爲了容許邊緣狀態重構，每一個端點（客戶端，服務器）須要保存一份他們發送到邊緣的消息的副本，至少須要保持到邊緣能夠確認這些副本再也不須要。
Since the edge is not reliable, the ordering of incoming messages must be stored elsewhere. One option is to send it to either the client or the server. Since each outgoing message from the edge, to either the client or the server, may indicate some state change at the edge, each such message should be accompanied with the incremental addition to the edge’s total incoming message ordering. Another option is to store it remotely. We intermix the two options in our design as we describe later.
由於邊緣是不可靠的，輸入消息的順序必須保存在其它地方。一種選擇是將順序發送到客戶端或服務器。因爲邊緣發送出去的每一個消息（到客戶端或服務器）均可能代表邊緣處的狀態改變，每條消息都須要帶有邊緣總輸入消息序列的增量加值。另外一種選擇是在遠程存儲。咱們在設計中混合使用這兩種選擇，見下文。

Of course, actually storing all outgoing messages forever may be prohibitive for most applications in terms of memory, and may have significant and negative performance implications in cases of failure. Thus, we use periodic snapshots in order to limit the size of the required buffers, and reduce the time for session reconstruction. We describe the design in detail in Section 3.

固然，永久存儲全部的輸出消息可能限制大部分應用（就內存而言），而且在故障情形下可能帶來顯著的負面性能影響。所以，咱們使用週期性快照以限制所需的緩存大小，同時下降會話重構時間。咱們在第三部分討論設計詳情。
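The interplay between endpoint message logs and periodic snapshots can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the class `OutgoingLog`, its method names, and the convention that a snapshot "covers" every message up to a given sequence number are all hypothetical.

```python
from collections import OrderedDict


class OutgoingLog:
    """Hypothetical endpoint-side log of messages sent to the edge."""

    def __init__(self):
        self._next_seq = 0
        self._pending = OrderedDict()  # seq -> payload, kept until covered

    def record(self, payload: bytes) -> int:
        """Store a copy of an outgoing message and return its sequence number."""
        seq = self._next_seq
        self._pending[seq] = payload
        self._next_seq += 1
        return seq

    def on_snapshot(self, covered_up_to: int) -> None:
        """Drop copies the edge no longer needs (seq <= covered_up_to)."""
        for seq in [s for s in self._pending if s <= covered_up_to]:
            del self._pending[seq]

    def replay(self):
        """Messages that would have to be resent to rebuild the edge state."""
        return list(self._pending.items())


log = OutgoingLog()
for i in range(5):
    log.record(b"msg%d" % i)
log.on_snapshot(covered_up_to=2)
print(log.replay())  # only messages 3 and 4 remain buffered
```

The buffer size is bounded by the number of messages sent between two snapshots, and recovery only has to replay that suffix.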

2 COMPUTATIONAL MODEL （計算模型）

Having computation at the edge allows one to (i) offload computation from the client (so it can be weak and/or low-powered), and/or (ii) offload computation from the server (so that responses to the client can have lower latency), and/or (iii) reduce bandwidth to the server (by doing preprocessing at the edge). Thus, the edge can be seen as extending the power of the client and/or extending the reach of the server. As a result, one cannot think of the edge as merely splitting the client code, or merely splitting the server code, but could involve a bit of both.

邊緣處計算使得(i)能夠卸載客戶端的計算（所以，客戶端能夠是較弱功能和/或低功耗的），和/或（ii）卸載服務器的計算（使得對客戶端的響應是低延遲的），和/或（iii）下降到服務器的帶寬（在邊緣進行預處理）。所以，邊緣能夠看作客戶端的功能擴展，和/或服務器可達性的擴展。其結果是，不能認爲邊緣僅僅是客戶端代碼的劃分，或者僅僅是服務器代碼的劃分，它可能同時包含客戶端和服務器代碼的一部分。

We assume the purpose of a system is to process inputs coming from clients and the server. This processing can result in packets being emitted to the server, or to the client (or both). Thus, the logical model is one of clients sending input to the system, and perhaps receiving responses, or updates from the edge, presumably based on input from the server.

咱們假設系統的目的是處理來自客戶端和服務器的輸入。這種處理可能致使數據包被髮送到服務器或客戶端（或者二者）。所以，邏輯模型是一個客戶端發送輸入到系統，而且可能接收來自邊緣的響應或更新（可能基於來自服務器的輸入）。

Servers. We assume that clients communicate with the system by logically sending messages to a server. This is done via the edge. The backend system handles all issues of replication and recovery for these servers and any other backend processing. There are many options here, and we leave this up to the application designer. Similar to the current client-server model, a single server can service several (or all) clients, can store data in a database, and allow clients to coordinate among each other. We place no restrictions on how clients communicate and coordinate through a server, and only require that the server be able to play back messages.

服務器。咱們假設客戶端經過邏輯上發送消息給服務器來與系統通訊。這經過邊緣完成。後端系統處理全部的複製事務，爲這些服務器和任意其它後端處理執行恢復。這裏存在不少選項，咱們將這些選擇留給應用設計者決斷。相似於當前的客戶端-服務器模型，單個服務器能夠服務於許多（或者全部）的客戶端，能夠保存數據到數據庫，並容許客戶端之間彼此協同。咱們不對客戶端如何經過服務器通訊和協同作任何限制，只要求服務器能夠回放消息。

Clients. We assume that clients do not depend on detailed timing information between messages or on latency of message response. Beyond this we allow clients to perform arbitrary processing, and to depend on arbitrary input including input from external sensors, video cameras, game controllers, etc. Finally, we assume that clients can be mobile, and as a result they might connect to different edges over time. Thus, applications may not assume that a particular set of clients is connected to a common edge. For applications where such aggregation is desirable and where clients are immobile (such as applications which aggregate inputs from multiple sensors [1]), we treat the set of clients whose input is being aggregated as a single logical client.

客戶端。咱們假設客戶端不依賴消息間以及消息響應延遲間的詳細時序信息。在這之上，咱們容許客戶端執行任意處理，能夠依賴於任意輸入（包括來自外部傳感器的輸入，視頻攝像頭，遊戲控制器等）。最後，咱們假設客戶端能夠是移動設備，它們能夠在不一樣的時間鏈接到不一樣的邊緣。所以，應用不該該假設是特定的客戶端鏈接到同一邊緣。對於指望這種聚合以及客戶端不會移動的應用（例如，聚合來自多種傳感器的輸入的應用[1]），咱們將輸入聚合的客戶端集合做爲一個邏輯客戶端。

The Edge. Clients send a series of messages to the edge, which in turn can send messages to the client and messages to the backend server. The edge also receives messages from the backend server, and can use those in its processing of messages (for example, these messages may cause state changes in the edge). We assume that the edge application correctly and consistently handles inputs during such state changes in the absence of failures (correct behavior in the presence of failures will then be provided by CESSNA automatically). In particular, we require that the edge application be designed so that state updates are atomic and a single message (or packet) is processed using only one version of the state.

邊緣。客戶端發送一系列消息給邊緣，邊緣接着能夠發送消息給客戶端和消息給後端服務器。邊緣也接收來自後端服務器的消息，而且能夠在它的消息處理中使用這些來自後端的消息（例如，這些消息可能致使邊緣的狀態改變）。不存在故障時，咱們假設邊緣應用在狀態改變期間正確且一致性地處理輸入（故障期間的正確行爲由CESSNA自動提供）。具體地，咱們要求邊緣應用的設計中，狀態更新是原子的，而且單個消息（或數據包）只使用狀態的一個版本處理。

2.1 Problem Statement （問題描述）

Given the above assumptions, we would like to design a consistency framework for a stateful edge, such that in case of a failure of an edge instance, another instance can be provisioned and the state is correctly recovered.

給定上述假設，咱們指望爲有狀態邊緣設計一種一致性框架，使得在邊緣故障情形下，能夠提供另外的邊緣實例，且狀態能夠被正確恢復。

The correctness of the recovery process is defined such that the recovered edge continues to process input messages and emit output messages exactly the same as the original edge would have. If there was more than one plausible outcome for the original edge at the time of the failure, the outcome of the recovered edge must be one of these plausible outcomes.

恢復過程的正確性定義爲：恢復的邊緣能夠繼續處理輸入消息，併發送輸出消息，這些消息與故障邊緣應該處理和發送的消息徹底一致。若是原始邊緣在故障時有多個合理的結果，恢復的邊緣的結果也必須是其中某個合理的結果。

3 FRAMEWORK DESIGN （框架設計）

The design of our framework is illustrated in Figure 1. In this section we discuss the design of each component in the figure.

咱們框架的設計如圖1所示。本節，咱們討論圖中每一個組件的設計。

圖1：咱們框架的設計。

3.1 Edge Design （邊緣設計）

The main contribution of this work is a design for an edge runtime environment that allows seamless local and remote failover of edge applications, while preserving the correctness of the state at the edge, such that the entire failover process is transparent to the client and the server applications.

本文的主要貢獻是設計了一種邊緣運行時環境，容許邊緣應用無縫的局部和遠程故障轉移，同時維護邊緣的正確狀態；這樣，整個故障轉移過程對客戶端和服務器應用來講是透明的。

3.1.1 Runtime Engine （運行時引擎）

Edge applications are software, or more precisely, processes. They can run in virtual machines or containers, with some hypervisor underneath. Our design does not require any specific runtime engine or hypervisor. We only require it to provide the following features:

邊緣應用是軟件，或者更精確地說是進程。它們能夠運行在虛擬機或者容器中，底層是某種虛擬機管理軟件。咱們的設計不要求爲任意指定的運行時引擎或者虛擬機管理軟件。咱們只須要它提供以下特徵：

Generic software: We would like the edge to be able to run generic software as much as possible, without limiting it to specific programming languages or uncommon libraries.
通用軟件：咱們指望邊緣能夠運行越多的通用軟件越好，不限制其爲特定的編程語言或不經常使用的庫。
Efficiency: The runtime engine should allow efficient running of multiple applications on top of it, in parallel.
高效：運行時引擎應該容許其上多個應用的高效並行運行。
Snapshotting: The runtime engine should be able to take snapshots of running instances and to restore an instance given such a snapshot.
快照：運行時引擎應該可以獲取運行時實例的快照，而且可以根據快照恢復實例。

Examples for existing products that provide these features are Docker [6], KVM [3], VMware [13], etc.

支持上述特性的現有產品包括Docker[6], KVM[3]和VMware[13]等。

3.1.2 Edge Storage （邊緣存儲）

Each edge runtime has some shared storage capabilities. This storage may be used by multiple instances of the same edge application when multiple client sessions benefit from sharing data – a prime example of this being content caching. The storage can be shared with instances on the same physical server, same rack, etc.

每一個邊緣運行時包含一些共享存儲的能力。當多個客戶端會話能夠從共享數據受益時，同一邊緣應用的多個實例可使用存儲 – 一個主要實例是內容緩存。存儲能夠由同一物理服務器、同一機架等共享。

Note that the shared storage must not be used for state-related storage. The state of each instance must be managed in memory for each instance, as snapshots do not include data from the shared storage. The state of an edge application must not be dependent on the presence, or the lack of presence, of a specific item in the shared storage.

注意共享存儲不能用於狀態相關的存儲。由於快照不包括共享存儲中的數據，每一個實例的狀態必須在實例的內存中管理。邊緣應用的狀態不該該依賴於共享存儲中特定項的存在或者不存在。

3.2 Edge Recovery （邊緣恢復）

Our recovery model assumes that the edge application is fully stateful and that its state is a function of both the client and the server. If the edge application is stateless, or if its state is only a function of one side of the communication (client or server), then the recovery model can be much simplified. We discuss these simplifications after presenting the fully stateful model.

咱們的恢復模型假設邊緣應用具備全狀態，而且其狀態是客戶端和服務器的函數。若是邊緣應用是無狀態的，或者若是它的狀態只是通訊一端（客戶端或服務器）的狀態的函數，那麼恢復模型更爲簡單。咱們在給出全狀態模型後討論這些簡化方法。

The recovery model has two layers: local recovery, which refers to the case when the replacement edge has relatively fast storage shared with the failed edge, and remote recovery, which refers to the case when the two edges are distant from each other. We make this distinction since the local case can be solved using existing techniques as we discuss below, while for the remote case a more complex model is required.

恢復模型包含2層：局部恢復，指的是替換邊緣和故障邊緣具備相對快速的共享存儲；遠程恢復，指的是兩個邊緣的距離較遠。咱們作出如此區分是由於局部恢復可使用咱們下面將要討論的現有技術解決，而遠程恢復須要更復雜的模型。

3.2.1 Local Recovery （局部恢復）

In order to provide fast local failover, we use a simplified version of the technique presented in FTMB [12]. In FTMB, the framework is able to recover arbitrary network functions by logging and sequencing their incoming and outgoing packets, and their nondeterministic decisions, together with periodic snapshots.

爲了提供快速的局部故障轉移，咱們使用FTMB[12]中給出的技術的簡化版本。FTMB中，框架可以經過日誌和輸入輸出數據包序列恢復任意網絡功能，同時使用週期性快照能夠恢復不肯定決斷。

In our case the situation is simpler: we treat sessions as logically independent entities, so we do not need to account for sequencing information across flows or sessions. Furthermore, we assume that the edge application is deterministic and produces consistent output given identical inputs, and thus do not log outgoing packets.

在咱們的案例下，情形更爲簡單：咱們將會話看做邏輯上獨立的實體，所以咱們不須要考慮跨流或會話的序列信息。此外，咱們假設邊緣應用是肯定的，給定輸入產生一致性輸出，所以不須要記錄輸出數據包。

Because of the reasons above we also do not need to pause the application while taking a snapshot. If the underlying runtime supports live snapshotting, we can simply store the last position in the incoming packets log, then take a snapshot, and store the two together. Upon recovery, our framework would restore the snapshot and replay packets from the stored position. Since we assume traffic is sent over TCP, the application will ignore packets it has seen since the time the log position was logged until the snapshot was actually taken, as such packets simply appear to be delayed duplicates.

鑑於以上緣由，咱們不須要在執行快照時暫停應用。若是底層運行時支持現場快照，咱們能夠簡單地存儲數據包日誌的最後位置，而後執行快照，並將二者存儲在一塊兒。恢復時，咱們的框架恢復快照並從存儲的位置開始重放數據包。因爲咱們假設數據流經過TCP傳輸，應用將忽略自日誌位置被記錄到快照實際執行期間的數據包，由於這些數據包只是簡單地被延遲複製。
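The local recovery procedure described above (store the log position, take a live snapshot, then on failure restore and replay) might look roughly like this. It is a toy sketch under stated assumptions: `EdgeApp`, `recover`, and the explicit `last_seq` check are hypothetical stand-ins, with the sequence check playing the role that TCP duplicate suppression plays in the paper.

```python
import copy


class EdgeApp:
    """Toy deterministic edge application with in-memory state."""

    def __init__(self):
        self.last_seq = -1  # highest sequence number already processed
        self.total = 0      # application state: sum of payload values

    def handle(self, seq: int, value: int) -> None:
        if seq <= self.last_seq:
            return  # already seen: behaves like a delayed duplicate
        self.last_seq = seq
        self.total += value


def recover(snapshot: EdgeApp, log, position: int) -> EdgeApp:
    """Restore the snapshot and replay the log from the stored position."""
    app = copy.deepcopy(snapshot)
    for seq, value in log[position:]:
        app.handle(seq, value)
    return app


log = [(i, i * 10) for i in range(6)]  # incoming packet log: (seq, value)
app = EdgeApp()
for seq, value in log[:2]:
    app.handle(seq, value)

position = 2                   # 1. store the current position in the log
app.handle(*log[2])            # a packet arrives before the snapshot is taken
snapshot = copy.deepcopy(app)  # 2. take the snapshot without pausing the app
for seq, value in log[3:]:
    app.handle(seq, value)     # the original edge keeps running, then fails

recovered = recover(snapshot, log, position)
print(recovered.total == app.total)
```

Packet 2 is replayed even though the snapshot already reflects it; because it is recognized as a duplicate, the recovered state still matches the state of the failed edge.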

In order to store the snapshots and the log, the system should have some local storage. This can be per physical machine, or per rack of multiple machines, for example. This snapshot storage is not to be logically confused with the shared storage discussed in Section 3.1.2, which is used for application data storage and sharing. Physically, they can be colocated.

爲了存儲快照和日誌，系統須要具備一些局部存儲。例如，局部存儲能夠是每物理機或者每機架（多個機器）的。快照存儲邏輯上不會和3.1.2節討論的共享存儲混淆；共享存儲用於應用數據存儲和共享。物理上，他們能夠是共存的。

3.2.2 Remote Recovery and Mobility （遠程恢復和移動性）

In the case when the client fails over to a remote edge, which does not share a packet logger and snapshot storage with the failed (or previous) edge, we delegate the responsibility for the recovery to the client and/or the server.

在客戶端故障轉移到遠程邊緣的情形下（其和故障邊緣不共享數據包日誌和快照存儲），咱們將恢復的責任委託給客戶端和/或服務器。

Upon taking a snapshot, the edge runtime stores it locally, but it also sends it to one or two of the endhosts (the client and/or the server). It is only necessary to send it to one of them for correct remote recovery, and we assume that for most applications it would make sense to only send it to the client (to allow scaling at the server). The snapshot is encrypted and signed by the edge, so the client cannot see its content or tamper with it.

獲取快照後，邊緣運行時將其存儲在本地，也將其發送到端點中的一個或者兩個（客戶端和/或服務器）。爲了正確執行遠程恢復，只須要將快照發送到其中的一個端點；同時，咱們假設對大多數應用來講只將快照發送到客戶端是明智的（容許服務器擴展）。快照經邊緣加密和簽名，所以客戶端沒法看到其內容也沒法篡改。

In addition to the snapshot, the edge also sends the endhosts information that will later help a recovered edge to determine the order in which messages from both sides were processed by the failed edge. This is done such that every packet emitted by the edge, to either the client or the server, contains the most up-to-date information on this ordering.

除了快照，邊緣還須要發送終端信息，用於後期幫助恢復的邊緣肯定消息的順序，這些消息來自兩端，且由故障邊緣處理。這使得邊緣發送的每一個數據包（到客戶端或服務器）包含順序的最新信息。

Upon recovery, in order to restart the session, the client and the server send the most up-to-date snapshot they received (if any), their outgoing message logs, and their knowledge of the ordering discussed above. The newly provisioned edge is then restored to the given snapshot, and then it orders the messages given by the endhosts based on the ordering they provided. This process is illustrated in Figure 2.

恢復時，爲了重啓會話，客戶端和服務器發送他們收到的（若是有）最新快照，它們輸出信息日誌以及上述討論的它們關於順序的信息。新提供的邊緣而後恢復到指定快照，而後根據提供的序列信息爲終端消息排序。處理過程如圖2所示。

圖2：遠程恢復過程描述。圓圈數字表示事件順序。
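One way to picture the reordering step of remote recovery is the sketch below. It makes several assumptions that the paper does not spell out: logs are dictionaries keyed by sequence number, each endhost's ordering knowledge is a prefix of the failed edge's full processing order since the last snapshot, and messages the edge never reported on may be appended in any order. The function name `rebuild_input_sequence` is hypothetical.

```python
def rebuild_input_sequence(client_log, server_log, client_order, server_order):
    """Return messages in the order the failed edge processed them.

    Logs map sequence number -> payload. Each order is a list of
    ("client" | "server", seq) tuples, assumed to be a prefix of the
    failed edge's processing order since the last snapshot.
    """
    order = max(client_order, server_order, key=len)  # most up-to-date view
    logs = {"client": client_log, "server": server_log}
    replay = [(src, seq, logs[src][seq]) for src, seq in order]

    seen = set(order)
    # Messages the failed edge never reported on: any interleaving is a
    # plausible outcome, so append them per source in sequence order.
    for src in ("client", "server"):
        for seq in sorted(logs[src]):
            if (src, seq) not in seen:
                replay.append((src, seq, logs[src][seq]))
    return replay


client_log = {0: "c0", 1: "c1", 2: "c2"}
server_log = {0: "s0", 1: "s1"}
client_order = [("client", 0), ("server", 0), ("client", 1)]
server_order = [("client", 0), ("server", 0)]

replay = rebuild_input_sequence(client_log, server_log, client_order, server_order)
for entry in replay:
    print(entry)
```

The new edge would first be restored from the snapshot and then fed `replay` in order; the tail of unreported messages is where the "one of the plausible outcomes" clause of the correctness definition applies.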

3.2.3 Stateless or Semi-Stateful Edge （無狀態或半狀態邊緣）

The recovery mechanisms described so far assume a fully stateful edge. However these mechanisms can be simplified in the following cases:

到目前爲止，討論的恢復機制假設邊緣是全狀態的。然而，這些機制在下列情形下能夠簡化：

If the edge application is completely stateless, we only need to be able to replay messages from the client and the server, which were already sent, but have not yet been processed. Thus, we only need the replay mechanisms described above (either using messages logged at the client, for a remote recovery, or using the local logger, in a local recovery).

若是邊緣應用是徹底無狀態的，咱們只須要可以回放來自客戶端或服務器的已發送但還沒有處理的消息。所以，咱們只須要上述討論的回放機制（對於遠程恢復使用客戶端消息日誌，或者在局部恢復中使用本地日誌）。

In the case of a semi-stateful edge application, where its state is only a function of one of the endhosts (client or server), we also need to have state snapshots, in addition to the replay mechanism that is required for the stateless case. The replay from the endhost on which state is dependent should be from the first message after the last snapshot was taken. We do not need the interleaving ordering of client and server messages.

在半狀態邊緣應用的情形下，狀態只是某一終端（客戶端或服務器）的函數，除了無狀態情形下的回放機制外，咱們還須要狀態快照。回放來自狀態所依賴的終端的消息，這些消息包含上次快照後的第一條消息開始的後續消息。咱們不須要客戶端和服務器消息的交叉順序。

Based on the nature of the application, it can declare whether it is stateless, semi-stateful, or stateful, and the framework can then adjust its recovery mechanisms for this application accordingly.

根據應用的特性，應用能夠聲明其是無狀態的、半狀態的或有狀態的。框架能夠根據應用相應的調整其恢復機制。
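The mapping from a declared application type to the recovery machinery it needs can be summarized in a small table-like function. The names `EdgeMode` and `recovery_requirements` are hypothetical; only the three categories and what each requires are taken from the text.

```python
from enum import Enum


class EdgeMode(Enum):
    STATELESS = "stateless"
    SEMI_STATEFUL = "semi-stateful"  # state depends on one endhost only
    STATEFUL = "stateful"            # state depends on client and server


def recovery_requirements(mode: EdgeMode) -> dict:
    """Which recovery mechanisms a declared mode needs, per Section 3.2.3."""
    return {
        "replay": True,  # every mode must replay unprocessed messages
        "snapshots": mode is not EdgeMode.STATELESS,
        "interleaved_ordering": mode is EdgeMode.STATEFUL,
    }


for mode in EdgeMode:
    print(mode.value, recovery_requirements(mode))
```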

3.3 Discovery （發現）

The client should be able to find the correct edge to connect to, based on the application it is connecting to, its location, etc. In our design, there is a discovery service that provides this information. Also, once the client is connected to an edge (e.g., the default one), this edge can provide it with alternative edge addresses, so in case of a failure, the client does not have to use the discovery service again but instead can immediately contact an alternative edge.

基於其鏈接的應用、位置等信息，客戶端須要可以找到其須要鏈接的邊緣。在咱們的設計中，包含提供這一信息的發現服務。當客戶端鏈接到某個邊緣（例如，默認邊緣）後，邊緣能夠爲其提供可選邊緣的地址。所以，在故障情形下，客戶端不須要再次使用發現服務，能夠直接聯繫可選擇邊緣。
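The failover path through alternative edges can be sketched as a simple fallback chain. Everything here is an assumption made for illustration: `pick_edge`, the `is_reachable` probe, the `discover` callback, and the example host names are not part of the paper's design.

```python
def pick_edge(is_reachable, alternatives, discover):
    """Choose a replacement edge after the current one fails.

    Tries the alternative addresses handed out by the previous edge first,
    and only falls back to the discovery service if none is reachable.
    """
    for address in alternatives:
        if is_reachable(address):
            return address
    return discover()


alternatives = ["edge-b.example", "edge-c.example"]  # given by the old edge
reachable = {"edge-c.example"}

chosen = pick_edge(
    is_reachable=lambda address: address in reachable,
    alternatives=alternatives,
    discover=lambda: "edge-d.example",  # stand-in for the discovery service
)
print(chosen)
```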

3.4 The CESSNA Protocol （CESSNA協議）

Each message sent from each of the entities in our design should be sequenced, so that we could later refer to it in the ordering described above for remote recovery (local recovery uses TCP sequencing). Packets going out of the edge also contain ordering information to be stored at the endhosts.

咱們設計中每一個實體發送的消息都須要包含序列信息，這樣咱們在後期纔可能在遠程恢復過程當中獲取其順序信息（局部恢復使用TCP順序）。邊緣輸出的數據包同時包含存儲在終端的順序信息。

In order to facilitate that, we design a simple layer-7 protocol, and we wrap all packets with its header. This header may contain just a sequence number (for messages going out of the hosts), or a sequence number and ordering information (for messages going out of the edge). This header precedes any layer-7 payload in packets.

爲了更加便利，咱們設計了一種簡單的7層協議，使用報頭信息包裝數據包。報頭信息只包含序列號（對於由主機發送的消息），或者序列號和順序信息（邊緣發送出的消息）。報頭在數據包的7層負載以前。

In the current version of the CESSNA protocol, the header for messages from endhosts to the edge is 16 bytes long. The header for messages from the edge is at least 20 bytes long, depending on the frequency of messages emitted by the edge. Each message from the edge contains the differential logging of packets received by the edge. Thus, the more messages emitted by the edge, the shorter the header is. We also note that the header used by our prototype is optimized for simplicity and not size; its size could be reduced by encoding the current information using variable-length integers or by leveraging application-specific properties.

當前版本的CESSNA協議中，終端到邊緣的消息的報頭長度是16字節。邊緣發出的消息的報頭至少是20字節，這依賴於邊緣發送消息的頻率。邊緣發送的每一個消息包含其接收數據包的差分日誌。所以，邊緣發送的消息越多，報頭長度越短。咱們注意到原型所採用的報頭是爲簡單化而優化的，而不是爲了大小；可使用變長數據編碼當前信息或者利用應用特定的屬性下降報頭大小。
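The paper gives only the header sizes (16 bytes from endhosts, at least 20 bytes from the edge), not the field layout. The sketch below therefore uses a purely hypothetical layout chosen to match those sizes: an 8-byte session identifier and an 8-byte sequence number, followed on edge messages by a 4-byte entry count and one 9-byte entry per newly ordered incoming message.

```python
import struct

# Hypothetical field layout; only the 16-byte and 20-byte sizes are from the paper.
HOST_HEADER = struct.Struct("!QQ")  # session id, sequence number: 16 bytes
ORDER_COUNT = struct.Struct("!I")   # number of ordering entries: 4 bytes
ORDER_ENTRY = struct.Struct("!BQ")  # source (0=client, 1=server), seq: 9 bytes


def pack_host_header(session_id: int, seq: int) -> bytes:
    return HOST_HEADER.pack(session_id, seq)


def pack_edge_header(session_id: int, seq: int, new_order) -> bytes:
    """Header for an edge message, carrying ordering entries not yet reported."""
    parts = [HOST_HEADER.pack(session_id, seq), ORDER_COUNT.pack(len(new_order))]
    parts += [ORDER_ENTRY.pack(source, s) for source, s in new_order]
    return b"".join(parts)


def unpack_edge_header(data: bytes):
    """Return (session_id, seq, ordering entries, header length in bytes)."""
    session_id, seq = HOST_HEADER.unpack_from(data, 0)
    (count,) = ORDER_COUNT.unpack_from(data, HOST_HEADER.size)
    offset = HOST_HEADER.size + ORDER_COUNT.size
    order = [
        ORDER_ENTRY.unpack_from(data, offset + i * ORDER_ENTRY.size)
        for i in range(count)
    ]
    return session_id, seq, order, offset + count * ORDER_ENTRY.size


header = pack_edge_header(7, 3, [(0, 5), (1, 7)])
print(len(pack_host_header(7, 3)), len(header))
print(unpack_edge_header(header))
```

Because each edge message carries only the entries added since the previous one, an edge that emits messages frequently sends short headers, which matches the observation in the text.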

3.5 Client / Server Design （客戶端/服務器設計）

There is no actual difference between a client and a server in our design, except for their possibly different sets of preferences, for example, whether to receive snapshots from the edge or not. An endhost, whether a client or a server, is simply an application running on top of our host platform, which manages the communication with the edge.

咱們的設計中，客戶端和服務器沒有實質性差別，除了它們可能的不一樣首選項集合。例如，是否從邊緣接收快照。在終端，客戶端或者服務器只是運行在咱們主機平臺上的應用，管理與邊緣的通訊。

4 INITIAL IMPLEMENTATION （初始實現）

We have begun to implement the design described in this paper. In our implementation, we use Docker as the runtime engine for the edge. We briefly examine our prototype in the following subsections.

咱們已經開始實現本文討論的設計。咱們的實現中，使用Docker做爲邊緣的運行時引擎。咱們在下面的子章節簡單介紹咱們的原型系統。

4.1 CESSNA Library （CESSNA庫）

We design a shared library to be used by applications both at the endhosts and at the edge, to take care of the serialization and deserialization of messages using the CESSNA protocol, logging messages at the endhosts, and so on.

咱們設計了可被終端（客戶端和服務器）和邊緣應用使用的共享庫，使用CESSNA協議處理消息的序列化和反序列化和終端的消息日誌等。

The runtime library is implemented in C++, and it overrides the Linux system calls for socket handling, such as connect, accept, send, recv, close (and several others). The library is loaded dynamically using the LD_PRELOAD environment variable so that applications need no modification in order to use it. This, for example, also enables Java and Python applications to use the library with no modification (as JVM and the Python interpreter use the OS socket library underneath).

運行時庫使用C++實現，並覆蓋了Linux系統的套接口處理調用，如connect、accept、send、recv、close（以及一些其它函數）。該庫使用LD_PRELOAD環境變量動態加載，這樣應用在使用該庫時不須要修改。這同時容許如Java和Python應用使用該庫而無需修改（由於JVM和Python解釋器使用OS底層的套接口庫）。
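Launching an unmodified application with a preloaded library could be done along these lines. The library path `/opt/cessna/libcessna.so` and the helper names are hypothetical; the only parts taken from the text are that the library is injected through the `LD_PRELOAD` environment variable and that the application itself is not changed.

```python
import os
import subprocess


def preload_env(library_path: str, base_env=None) -> dict:
    """Return a copy of the environment with the library added to LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD")
    # Keep any library that is already preloaded; ours goes first.
    env["LD_PRELOAD"] = f"{library_path}:{existing}" if existing else library_path
    return env


def launch(command, library_path: str) -> subprocess.Popen:
    """Start an unmodified application with the socket calls interposed."""
    return subprocess.Popen(command, env=preload_env(library_path))


# Hypothetical library location; the application command is left untouched.
env = preload_env("/opt/cessna/libcessna.so", base_env={"PATH": "/usr/bin"})
print(env["LD_PRELOAD"])
```

A call such as `launch(["python3", "client_app.py"], "/opt/cessna/libcessna.so")` would then start the application, assuming the shared object exists at that path on a Linux host.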

4.2 Host Agent （主機代理）

Our host agent is implemented in Python. It is responsible for all slow path tasks at the hosts: logging outgoing packets, tracking the order of packets reported by the edge, receiving snapshots, and restarting sessions in case of a failure. The agent communicates with its corresponding edge runtimes out-of-band, in parallel to the application sessions.

主機代理使用Python實現，負責主機的全部慢速路徑任務：輸出包的日誌，追蹤邊緣報告的數據包的順序，接收快照，以及故障時重啓會話。代理和其相應的邊緣運行時通訊，該通訊是帶外通訊，且與應用會話並行。

4.3 Edge Platform （邊緣平臺）

The edge platform is based on a Python agent that runs adjacent to the Docker engine to manage snapshots and communication with the host agents.

邊緣平臺基於運行於Docker引擎鄰近的Python代理，用於管理快照和與主機代理的通訊。

4.4 CESSNA Over HTTP （HTTP之上的CESSNA）

In addition to the described implementation, which requires the usage of the CESSNA library and the host agent on the client, we are also working on a version of CESSNA specifically suited for web applications (running over HTTP/HTTPS) which does not require the installation of an additional CESSNA agent or usage of the CESSNA library at the client.

除了上述討論的實現（須要使用CESSNA庫和客戶端上的主機代理），咱們正在研發適用於web應用的CESSNA版本（運行於HTTP/HTTPS之上），它不須要安裝額外的CESSNA代理，也不須要使用客戶端的CESSNA庫。

CESSNA over HTTP is an extension to the CESSNA edge and server platforms that implements the client features in Javascript, so that the client can participate in the backup and recovery process just as in the original design, with the web browser handling the entire logic by simply running Javascript code given by the edge or the server. This code is responsible for logging outgoing requests, storing ordering information received from the edge, and managing snapshots. It is also responsible for recovery of sessions.

HTTP之上的CESSNA是對CESSNA邊緣和服務器平臺的擴展，使用Javascript實現客戶端特徵，這樣客戶端就能夠和原始設計同樣參與到備份和恢復過程。web瀏覽器經過運行邊緣或服務器的Javascript代碼處理全部的邏輯。代碼負責記錄外出請求，存儲由邊緣接收的信息的順序，並管理快照；同時負責會話恢復。

5 APPLICABILITY & DISCUSSION （適用性和討論）

Many applications that benefit from the Client-Edge-Server model do not have a stateful edge, or do not have strong consistency requirements on their edge state. For example, an audio/video conferencing application may benefit from an edge which can reflect streams to other clients on the same edge, and combine and transcode data being sent "upstream" to remote clients. If an edge is transcoding a video frame for a client and the client migrates to another edge, it may be acceptable to simply drop that frame (indeed, if the time taken by the infrastructure to provision a new edge and/or the time taken by the client to establish a session with the new edge is greater than the duration of a frame, there is no point in doing otherwise).

許多受益於客戶端-邊緣-服務器模型的應用沒有有狀態邊緣，或者對它們邊緣狀態不須要強一致性。例如，音頻/視頻會議應用可能受益於邊緣，該邊緣可以將流反射到同一邊緣的其它客戶端，同時能夠組合及轉碼上行數據到遠程客戶端。若是某個邊緣正在爲客戶端轉碼視頻幀，同時客戶端移動到另外一個邊緣，簡單地丟棄該幀是可接受的（事實上，若是基礎設施提供新邊緣的時間，以及/或客戶端與新邊緣創建會話的時間大於幀時間，作其餘選擇是沒有意義的）。

For other applications, however, maintaining state consistency during cases of failure or migration may be vital. For instance, one might imagine an edge-based system which uses deep packet inspection to provide network-based security. DPI systems typically must track the state of various protocols; if such an application does not maintain this state perfectly under failover, connections may be aborted or policy violations may occur. It is applications in this class – stateful applications which benefit from consistency during failure or migration – for which CESSNA is ideally suited. This raises two points.

然而，對於其它應用，維護故障期間或轉移期間的狀態一致性是相當重要的。例如，能夠設想一個基於邊緣的系統，系統中使用深度包檢測（DPI）提供基於網絡的安全。DPI系統一般必須追蹤多種協議的狀態；若是這種應用在故障轉移期間不維護精確的狀態，鏈接可能中斷或發生策略違規。這種類型的應用（有狀態應用）得益於故障或轉移期間的一致性，這也是CESSNA理想的適用場景。這就提出兩點。

First, CESSNA does not force an application to use stronger guarantees than it needs. Many real-world Client-Edge-Server applications may contain several components, some that require consistency guarantees and some that do not. One can choose to use CESSNA for only the portion of the application that falls into the former class.

第一，CESSNA不強制應用使用比其所須要的更強的保證。許多現實中的客戶端-邊緣-服務器應用可能包含多個組件，其中一些要求一致性保證，另一些不須要。能夠爲前者選擇使用CESSNA。

Second, it is certainly possible to write applications with seamless failover without CESSNA. However, doing this on a per-application basis (and, in particular, getting it right) is typically nontrivial. The benefit of CESSNA is that it factors out this aspect of the design and provides a general solution that applications can just use.

第二，不使用CESSNA也能夠編碼具備無縫故障轉移的應用。然而，爲每一個應用都這麼作（特別地，可以確保正確）是非平凡的。CESSNA的好處是提取出設計中的這類因素，併爲應用提供可用的通用方案。
