Anveshak: Placing Edge Servers In The Wild

This post presents a paper from the SIGCOMM 2018 Workshop on Mobile Edge Communications (MECOMM).

I have translated the paper. Because of time pressure and my limited English, mistakes are unavoidable; readers are welcome to point them out.

This post and the translation are for learning purposes only. If anything is inappropriate, please contact me and I will remove it.

The paper has four authors: Nitinder Mohan, Aleksandr Zavodovski, Pengyuan Zhou, and Jussi Kangasharju of the University of Helsinki, Finland.

ABSTRACT

Edge computing provides an attractive platform for bringing data and processing closer to users in networked environments. Several edge proposals aim to place the edge servers within a couple of hops of the client to ensure the lowest possible compute and network delay. An attractive edge server placement is to co-locate servers with existing (cellular) base stations to avoid additional infrastructure establishment costs. However, determining the exact locations for edge servers is an important question that must be resolved for optimal placement. In this paper, we present Anveshak, a framework that solves the problem of placing edge servers in a geographical topology and provides the optimal solution for edge providers. Our proposed solution considers both end-user application requirements and the deployment and operating costs incurred by edge platform providers. The placement optimization metric of Anveshak considers the request pattern of users and existing user-established edge servers. In our evaluation based on real datasets, we show that Anveshak achieves a 67% increase in user satisfaction while maintaining high server utilization.

1 INTRODUCTION

Novel applications, such as the Internet of Things (IoT) and augmented and virtual reality, have exponentially increased the amount of data generated and transported over the network. To mitigate the response time and handle large-scale data analysis closer to the users and data generators, researchers have proposed edge clouds. As the name suggests, an edge cloud is a consolidation of compute servers deployed very close to end users with limited compute, storage and network capability [1, 12, 22]. The central objective of edge clouds is to ensure low network delays for latency-critical applications such as autonomous driving, drones, augmented reality, etc. [10]. Such a requirement can be fulfilled by exploiting the physical proximity between the edge server and the client.

Existing studies focus on optimal utilization of the edge server by end-user requests, assuming that the server has been placed already [2, 18]. Little to no attention has been paid to modeling the edge server deployment problem along with its placement constraints. There are similarities between the edge server placement problem and the replica server deployment problem in CDNs, for which several solutions exist in the literature [14, 15, 17]. Akin to the CDN cache server placement problem, edge server placement must also ensure consistent connectivity to end users while minimizing the cost of such a deployment. However, we argue that despite similarities in their objectives, the two placement problems are essentially quite different. Unlike replica servers, an edge server is more likely to cater to several compute requests of local relevance which do not require high-volume data transfer over the network. In such a case, the availability and network latency associated with an edge server have greater priority over link usage and network bandwidth.

Several options for deploying edge servers have been proposed in the literature. Mobile Edge Clouds (MECs), defined by the European Telecommunications Standards Institute (ETSI), aim to co-locate edge servers with cellular base stations set up by telecom providers operating in the area [1]. On the other hand, researchers have also proposed to utilize non-conventional compute resources, such as WiFi access points, smart speakers, network switches, etc., to support computation capability at the network edge [9]. Unlike MEC, these resources are owned and managed by end users. Even though the proposed models differ in deployment requirements, management, capacities, etc., we envision that the models are independent of the protocols, software stacks and user applications that will drive the edge cloud platform as a whole.

In this paper, we present Anveshak, a deployment framework which enables edge service providers to select optimal sites for edge server placement. Our contributions are as follows. (1) Anveshak considers the availability and density of unmanaged edge resources in the area to steer the deployment location of a managed server. The novelty lies in predicting future deployments of user-owned edge servers and incorporating them into the current edge server deployment problem. (2) We identify the areas of higher preference for deployment by observing the mobility pattern of the users in the area. We consider previous requests issued by the users to prioritize locations with a higher probability of edge service requests, thereby optimizing user satisfaction. (3) We extensively simulate Anveshak on real-world datasets collected over the city of Milan, Italy. Our evaluation shows that Anveshak increases user request satisfaction by 67% while maintaining an average server utilization of 83%. To the best of our knowledge, there are no previous studies which consider server provisioning in a scenario where multiple edge cloud models coexist and operate in the same physical space.

The rest of the paper is organized as follows. Section 2 discusses the physical edge cloud abstraction composed of multiple edge cloud models in the same space. Section 3 provides an in-depth description of the model, framework design and optimization problem of Anveshak. We implement Anveshak and evaluate its performance on real datasets in Section 4. Section 5 reviews the related work. We conclude the paper in Section 6.

2 PHYSICAL EDGE CLOUD NETWORK

Researchers have proposed several edge cloud architectures to support the use cases present in the real world [2]. Mobile Edge Computing (MEC) is a telecommunication-vendor-centric edge cloud model wherein the deployment, operation, and maintenance of edge servers is handled by an ISP operating in the area [10]. The model has garnered interest from standardization bodies [6]. On the other hand, researchers have proposed a user-centric view where a user can deploy computationally capable network devices local to their surroundings. The proliferation of smart speakers, home automation hubs, and intelligent wireless access points provides evidence for the adoption of such edge architectures [16]. Unlike MEC resources, user-centric edge resources are self-managing in nature and are less likely to have consistent network and computational availability.

Both of the above models consider different deployment options, from in-network placement at the aggregation level to opportunistic consolidation composed of small compute hubs. However, we consider a holistic view of a physical space where several edge servers belonging to different cloud models and technologies coexist. As each model brings its own advantages and drawbacks, the coexistence and cooperation between available edge servers will be critical to efficient computation and context availability in the future. Figure 1 shows the physical abstraction of edge servers and users coexisting in a geographical area. The model is a two-tier hierarchy of edge servers in a physical space alongside users; the details are explained below.

Figure 1: Physical abstraction of edge servers and users coexisting in a geographical area.

Users: The subscribers of the edge cloud in a region act as the source of all compute requests. Previous research on user mobility has shown that the user request distribution in any area is temporally and behaviorally influenced [11]. For example, user request density is higher in city centers than in suburban areas. Such request patterns profoundly affect the utilization of edge server deployments in any region. An efficient server deployment algorithm must consider the origin and pattern of user requests in a geographical region to allocate server resources for optimal utilization and availability.

User-managed Edge: This layer is composed of edge servers which are managed by individual entities for local usage and are likely to be deployed in households, small workplaces, etc. These servers utilize WiFi (short-range) networks to interact with end users. The user-managed edge servers are responsible for handling computational requests from a small set of clients and are thus limited in computation power. However, they provide a very local context to user-generated requests. The availability of such servers is highly dependent on user residency and mobility itself. For example, densely populated residential areas and tourist attractions have a higher availability of WiFi access points than industrial/office areas [19].

Service Provider-managed Edge: The top-most layer of the edge server abstraction model is composed of service-provider-managed edge servers. Such servers are co-located with cellular base stations set up in the region, owing to their strategic locations and constant ISP management. An edge server physically co-located at the base station significantly reduces the operation and maintenance costs involved in specifically setting up a location to house a server. Unlike the user-managed edge, the edge servers managed by a third-party service provider have a higher computational capability and wider-area coverage. These edge servers utilize the network fabric and capability offered by the cellular base station to connect with users and amongst themselves. ISPs can also remotely manage and maintain these edge servers by utilizing their existing infrastructure.

3 ANVESHAK: MODEL AND DESIGN

The problem of deploying edge servers in a physical space boils down to ensuring low latency, proximity and high availability to clients. Further, installing a server at any location incurs a combination of CAPEX (purchase and deployment) and OPEX (maintenance, security) costs for the service provider. To maximize profits, it is in the best interest of the provider to select edge sites intelligently such that the deployed server has maximum impact and utilization. Anveshak enables service providers to find optimal edge sites in a large metropolitan area. It selects a prioritized list of cellular base stations within an area which can be augmented with an edge server. Through its insightful utilization of pre-existing user request patterns and edge servers in the area, Anveshak ensures that the selected edge site has maximum reachability and high user request satisfaction.

Figure 2 shows the workflow of Anveshak. We categorize the framework's functioning into three phases: user mapping, user edge incorporation and edge location selection.

Figure 2: Overall workflow of Anveshak.

Phase 1: User Mapping to Physical Space

The design of Anveshak is based on the assumption that the edge service provider works in conjunction with the ISP to ensure optimal installation of edge servers on ISP-managed base stations. Therefore, the service provider will have access to the user request database from ISPs operating in the region. These request databases can include Call Detail Records (CDR), message requests, internet usage, etc., which can be aggregated to form user request patterns. Anveshak utilizes the dataset of communication requests made by the clients of the ISP in its first phase. The objective of the framework in this phase is to identify areas of high communication requests in the geographical region, as these areas have a higher probability of receiving edge compute requests.

Anveshak begins the phase by dividing the space S into evenly spaced square grids (shown in Figure 3a). Further, Anveshak maps the user communication requests originating from a location in S, as shown in Figure 3b. The user requests are normalized and averaged over a time duration of one to several months such that temporal outliers in the dataset (user gatherings, fairs, concerts, etc.) are ironed out. Once the framework has all user requests mapped to a point set P in S, it clusters them based on inter-request distances and density. The clustering algorithm identifies regions with dense and frequent user requests in S by requiring a minimum number of request points, MinPts, within the radius of an existing base station in that area. Further, the algorithm also specifies ϵ, which defines the minimum required distance between two points to classify them as part of a single cluster. Figure 3c presents the user requests clustered together in S. The choice of ϵ and MinPts is key to efficient clustering in Anveshak and can be easily adjusted by the service provider to best suit deployment requirements.
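
As a rough illustration of this step, the sketch below clusters request locations with DBSCAN, the framework's default algorithm (see Section 4). The coordinate conversion, the eps/min_samples values, and the synthetic points are illustrative assumptions, not values taken from the paper.

```python
# Rough sketch of Phase 1 clustering with DBSCAN (the framework's default algorithm).
# Request points are (lat, lon) pairs already normalized/averaged over the window.
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_requests(points, eps_m=300.0, min_pts=20):
    """Cluster user-request locations; returns one label per point (-1 = noise)."""
    xy = np.radians(points) * 6_371_000.0  # crude degrees->meters scaling for a city-sized area
    return DBSCAN(eps=eps_m, min_samples=min_pts).fit_predict(xy)

rng = np.random.default_rng(0)
pts = np.vstack([
    rng.normal([45.46, 9.19], 0.001, size=(200, 2)),          # a dense, city-center-like spot
    rng.normal([45.48, 9.22], 0.001, size=(150, 2)),          # a second dense spot
    rng.uniform([45.40, 9.10], [45.55, 9.30], size=(50, 2)),  # scattered background requests
])
labels = cluster_requests(pts)
print("clusters found:", len(set(labels) - {-1}))
```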

Figure 3: Phase 1 of Anveshak.

Following request clustering, Anveshak maps arbitrary cluster shapes to the corresponding grids in S (shown in Figure 3d). The density of a cluster is normalized to generate a grid-based heatmap of the region. In doing so, Anveshak can handle overlapping clusters, small dense clusters, and clusters of various shapes more efficiently than related approaches. Furthermore, this enables the framework to overcome the inefficiencies of the clustering algorithm used.
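
A minimal sketch of how clustered request points could be rasterized into a normalized per-grid heatmap follows; the grid geometry and function names are assumptions for illustration.

```python
# Minimal sketch: rasterize clustered requests onto the square grid over S and
# normalize per-grid counts to [0, 1]. Grid geometry and names are illustrative.
import numpy as np

def grid_heatmap(points, labels, origin, cell_m=250.0, shape=(100, 100)):
    """Count non-noise request points per grid cell, then normalize to [0, 1]."""
    heat = np.zeros(shape)
    offsets = (np.radians(points) - np.radians(origin)) * 6_371_000.0  # rough meters from origin
    for (north, east), lab in zip(offsets, labels):
        if lab == -1:
            continue  # DBSCAN noise points do not contribute to the heatmap
        i, j = int(north // cell_m), int(east // cell_m)
        if 0 <= i < shape[0] and 0 <= j < shape[1]:
            heat[i, j] += 1
    return heat / heat.max() if heat.max() > 0 else heat
```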

The density heatmap and its location coordinates are fed into the next phase of Anveshak.

Phase 2: User Edge Incorporation

As discussed in Section 2, compute-capable network devices such as smart speakers, home automation hubs, smart WiFi routers, etc. have become quite popular and are expected to develop into a $4.2 billion market by 2022 [16]. Such smart devices can resolve relatively small computations locally to the clients and will be preferred by users over a service-provider-managed server in the same location. The availability of these devices significantly impacts the utilization of a deployed edge server in an area, as the number of user requests which will be offloaded to a managed edge server will notably reduce. In its second phase, Anveshak integrates the current and future availability of user-managed edge servers deployed by end users. It does so by building on the assumption that areas with a high density of WiFi access points (APs) are more likely to have a future deployment of user-managed edge servers. The inclusion of this phase is key to the novelty of Anveshak over related works.

Anveshak merges all user requests from grid Gi ∈ S such that user requests in the same cluster C are distributed over several grid groups. It further exploits already existing datasets of WiFi APs in space S (such as Wigle [19]) and maps them onto the same grid Gi. Based on the density of the existing deployment, Anveshak revises the user request heatmap of S, where grids with denser WiFi availability receive a negative request density adjustment. The resulting map prioritizes locations with a lower number of local edge deployments, as the probability of clients requesting a provider-managed edge server there is higher. The grid locations GL are fed into the next phase of Anveshak.
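
The sketch below illustrates one way such a negative adjustment could be applied, assuming the Phase 1 heatmap and per-grid WiFi AP counts are already available; the penalty weight is an assumption, since the paper does not spell out the exact formula.

```python
# Sketch of the Phase 2 revision: grids with a dense user-managed edge (approximated
# by WiFi AP counts) receive a negative adjustment. The penalty weight is an assumption.
import numpy as np

def adjust_for_user_edge(request_heat, ap_counts, penalty=1.0):
    """Return a revised heatmap that de-prioritizes WiFi-dense grids."""
    ap_density = ap_counts / ap_counts.max() if ap_counts.max() > 0 else ap_counts
    return np.clip(request_heat - penalty * ap_density, 0.0, None)  # keep densities non-negative
```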

Phase 3: Edge Location Selection

In its final phase, Anveshak orders the grids from Phase 2 in increasing order of their user request density ratio. The set of users U within a grid Gi can be served by x possible edge locations (existing base stations) denoted by LGi = {l1, . . . , lx}. Anveshak ensures one-hop connectivity between users and the deployed server by selecting a location lk which is best reachable for the majority of users in Gi. Let Rmax(u, Sl) be the maximum tolerated network distance between U and Sl, where l ∈ LGi.

Based on the requirements and the number of servers to be placed in S, the Rmax of Sl will specify the cluster boundary for satisfying u and is influenced by the connectivity range of the existing base station. Further, let α denote the maximum cost incurred by users in the cluster to access the server Sl. Thus, the network cost n(S, u) of a cluster can be denoted as in Equation 1.

In order to estimate the network costs between users and a server location within a grid, the model utilizes a coordinate-based network latency approximation technique [13]. Anveshak attempts to minimize the latency between the majority of users and the deployed server Sl within grid Gi to one hop, based on Equation 1. Further, the users in the same grid which do not fall under the direct connectivity of Sl are reachable within 2-3 hops by utilizing the internal network between base stations.

Let xl denote a binary decision variable which is 1 if we locate Sl at candidate location l ∈ LGi. Therefore, the optimal server location for an arbitrary user u can be defined as in Equation 3.

Equation 3 is a variant of the facility location problem (FLP) [7] with network capacity constraints. The resulting optimization is a well-known NP-hard problem, an approximate solution of which can only be obtained by adding specific placement constraints. However, since Anveshak divides S into small grids with a limited number of edge site locations, even the worst-case iterative solution for optimizing Equation 3 takes a reasonable amount of time.
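
Since Equations 1-3 are not reproduced here, the following is only a hedged sketch of the per-grid iterative selection described above: each candidate base station in a grid is scored by how many users it reaches within Rmax and by the average distance to them, with geometric distance standing in for the coordinate-based latency approximation of [13]. All names and defaults are illustrative assumptions.

```python
# Hedged sketch of the per-grid site selection in Phase 3. The paper casts this as a
# capacity-constrained facility-location variant (Equation 3); here each grid's small
# candidate set is simply brute-forced, mirroring the "worst-case iterative solution"
# the authors note remains tractable.
import math

def dist_m(a, b):
    """Rough distance in meters between two (lat, lon) points (equirectangular)."""
    (la1, lo1), (la2, lo2) = ((math.radians(p[0]), math.radians(p[1])) for p in (a, b))
    x = (lo2 - lo1) * math.cos((la1 + la2) / 2)
    return 6_371_000.0 * math.hypot(x, la2 - la1)

def select_site(users, candidates, r_max=1000.0):
    """Pick the candidate location reachable (within r_max) by the most users,
    breaking ties by the lower mean distance to the users it covers."""
    best, best_key = None, None
    for loc in candidates:
        d = [dist_m(u, loc) for u in users]
        covered = [v for v in d if v <= r_max]
        mean_cov = sum(covered) / len(covered) if covered else float("inf")
        key = (len(covered), -mean_cov)
        if best_key is None or key > best_key:
            best, best_key = loc, key
    return best
```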

4 EVALUATION METHODOLOGY

We now evaluate the efficiency of Anveshak in placing edge servers over Milan, Italy by utilizing several open datasets. We first implement Anveshak's workflow (shown in Figure 2) as two separate pluggable modules. Phase 1 of the framework is implemented as a clustering module in R. The module produces clusters of user requests based on the request patterns, WiFi access points, and base station datasets provided to it. We design the module to be independent of the clustering algorithm used, which can be freely selected by the service provider (the default is DBSCAN). Phases 2 and 3 of Anveshak are implemented in Python and return base station coordinates to the service provider, considering the constraints imposed.

We compare Anveshak with two alternative placement approaches which have been discussed in the literature [14, 17]. The approaches are described as follows, with an illustrative sketch after the list:

  • Greedy: This method allocates average user request densities to the base stations in the area of interest. It then utilizes a greedy selection algorithm to select the top-k base stations which serve the most users in the area.
  • Random: As its name suggests, this approach randomly chooses k base stations on the map and assigns edge servers to them.
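
A minimal sketch of the two baselines follows, assuming a mapping from base station IDs to the average request density they would serve; both functions are illustrative rather than the implementations used in [14, 17].

```python
# Illustrative sketches of the baselines, assuming `bs_requests` maps each base
# station ID to the average user-request density it would serve.
import random

def greedy_placement(bs_requests, k=50):
    """Top-k base stations by served request density (no overlap or user-edge handling)."""
    return sorted(bs_requests, key=bs_requests.get, reverse=True)[:k]

def random_placement(bs_ids, k=50, seed=None):
    """k base stations chosen uniformly at random."""
    return random.Random(seed).sample(list(bs_ids), k)
```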

Unlike Anveshak, both of the approaches mentioned above neither consider whether the selected base stations serve the same set of users due to connectivity overlap, nor account for the availability of other edge servers in the area.

4.1 Dataset

In order to gauge the impact of the selection algorithm on real networks, we utilize several open datasets for the city of Milan, Italy. For user connectivity requests, we use the dataset published by Telecom Italia for November 1st to December 31st, 2013. The anonymized dataset divides the map of Milan into 100x100 grids of 250 m width. The dataset records each user's internet connection to a base station as a user request tied to its grid ID, along with the time when it was made. In our evaluation, Anveshak utilizes the average total user requests in November 2013 to generate clusters of user requests. The heatmap of unclustered user internet requests for November is shown in Figure 4a.
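
A hedged sketch of preparing this input: average the November internet-activity records per grid square before clustering. The column names used here are assumptions about the released files and may need adjusting.

```python
# Hedged sketch: average the November internet-activity records per 250 m grid square
# as the clustering input. The column names ("time", "square_id", "internet") are
# assumptions about the released files.
import pandas as pd

def average_november_requests(csv_path):
    df = pd.read_csv(csv_path, parse_dates=["time"])
    nov = df[df["time"].dt.month == 11]
    return nov.groupby("square_id")["internet"].mean()  # mean activity per grid square
```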

Figure 4: Distribution of normalized user communication requests, WiFi access points, and Telecom Italia cellular base stations in Milan, Italy.

We map all WiFi access points in the same area as the Telecom Italia dataset by utilizing the open crowd-sourced dataset from Wigle [19]. The dataset contains the SSID, location coordinates, signal strength, channel number, etc. for all access points. Out of the entire dataset, we filter out the hotspot access points to reduce variations in access point location density. Figure 4b shows the density heatmap of WiFi access points in Milan. We utilize an open dataset of cellular base stations around the world and select the ones in Milan using the coordinates provided in the dataset. We further filter this set and use more than 800 Telecom Italia base stations in Milan in our evaluation. The heatmap of Telecom Italia base stations in Milan is shown in Figure 4c.

As assumed in the design of Anveshak, we can observe from Figure 4 that both user requests and WiFi access points are concentrated in populated areas of the city (such as the city center), whereas the cellular base stations are evenly distributed throughout the map.

4.2 Results

We now evaluate the placement efficiency of the discussed approaches. We task the placement algorithms with selecting 50 out of the total 812 base stations in Milan as edge server deployment sites. The average coverage radius of a base station in the dataset is a little higher than 1000 m; we utilize a coordinate-based latency approximation [13] to estimate the user requests which can be satisfied within this area. These requests may originate from neighboring grids of the selected base station. Further, Anveshak utilizes users' internet traffic requests for November 2013 for the initial clustering and edge site selection. We then evaluate the efficiency of the selection on user requests from December 2013.
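
The sketch below shows the kind of distance-to-latency check we assume when counting a request as satisfiable by a selected site; the coverage radius matches the text, while the latency constants are placeholders and not taken from the approximation technique of [13].

```python
# Placeholder for the coordinate-based latency check used when counting satisfiable
# requests; the latency constants are illustrative assumptions.
def estimate_latency_ms(distance_m, access_overhead_ms=2.0, per_km_ms=0.05):
    """Very rough one-hop latency model: fixed access overhead plus a distance term."""
    return access_overhead_ms + per_km_ms * (distance_m / 1000.0)

def satisfiable(distance_m, radius_m=1000.0, budget_ms=5.0):
    """A request counts as satisfied if it is within coverage and under the latency budget."""
    return distance_m <= radius_m and estimate_latency_ms(distance_m) <= budget_ms
```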

We focus our evaluation and comparison on two metrics: (1) the percentage of user requests satisfied by the selected edge sites, and (2) the total utilization of the deployed edge servers. All our results are averaged over ten runs.
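
For clarity, the two metrics can be computed as in the sketch below, assuming per-run counters of requests served at the edge and of consumed versus available server capacity; the helper names are illustrative.

```python
# Sketch of the two reported metrics, given per-run counters (names are illustrative).
def request_satisfaction(served_at_edge, total_requests):
    """Fraction of user requests satisfied by the selected edge sites."""
    return served_at_edge / total_requests if total_requests else 0.0

def server_utilization(requests_handled, capacity_per_interval, intervals, servers):
    """Average utilization of the deployed edge servers over the evaluation period."""
    return requests_handled / (capacity_per_interval * intervals * servers)
```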

User Request Satisfaction: Figure 5 compares the percentage of user requests which were satisfied by the base stations selected by each approach for every third day in December. As observed from the figure, edge servers deployed via Anveshak can serve ≈ 67% more users in an area than Greedy. We attribute this behavior to the greedy selection of sites based purely on user request counts that is inherent to Greedy's functioning. Even though the site selection by Greedy prefers the highest-serving base stations, it often fails to consider locations which are far away from densely populated areas yet originate a significant number of requests. On careful analysis, we found that unlike Anveshak, which satisfies all clusters on the map, Greedy favors base stations within the densest user cluster.

Figure 5: Normalized user request satisfaction.

From the results, we also see that Anveshak satisfies ≈ 25% of total user requests on average by selecting 8% of the total base stations. In our further experiments, we found that Anveshak achieves more than 90% user satisfaction by installing just 124 edge servers (on average), whereas Greedy and Random require 218 and more than 300 servers, respectively. We do not show the detailed results due to space limitations.

Server Utilization: We deploy edge servers at all selected locations, where a server can handle up to 500 user requests every 10 minutes. Further, we designate 10% of the WiFi APs in the coverage area as compute-capable, and a single AP handles 50 requests per 10 minutes within its grid, thereby operating at 10% of the compute power of a managed edge server. As discussed in Section 2, a user-managed edge server first handles user requests; once its capacity is exceeded, requests are sent to the base station edge server. If a base station receives more requests than its capacity in 10 minutes, it offloads the additional requests to the remote cloud. Figure 6 shows the overall server utilization in December 2013.
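
A hedged sketch of this offload cascade for a single grid and 10-minute interval follows; the capacities match the numbers above, while the function and variable names are assumptions.

```python
# Hedged sketch of the offload cascade for one grid and one 10-minute interval:
# compute-capable WiFi APs absorb requests first (50 each), the managed edge server
# takes the next 500, and the remainder goes to the remote cloud.
AP_CAP, EDGE_CAP = 50, 500

def simulate_interval(requests, n_user_aps):
    """Return (edge_served, cloud_offloaded, edge_utilization) for one interval."""
    after_aps = max(0, requests - n_user_aps * AP_CAP)
    edge_served = min(after_aps, EDGE_CAP)
    return edge_served, after_aps - edge_served, edge_served / EDGE_CAP

print(simulate_interval(1200, 10))  # 10 APs absorb 500 requests -> (500, 200, 1.0)
```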

Figure 6: Average server utilization.

Anveshak achieves 83% server utilization on average, whereas Greedy and Random achieve only 66% and 12% utilization, respectively. We attribute Anveshak's high utilization to its selection of edge sites with a lower availability of user-managed edge servers. The sites selected by Greedy have a high concentration of WiFi APs, which leads to fewer requests being sent to the managed server.
