Under the Hood: Building the App Center recommendation engine

Date: 2019-12-13

As more apps on Facebook Platform have launched over the years, the types of apps available have become more diverse, making it crucial that people see the most relevant and highest-quality apps in channels like news feed and App Center.

While news feed has always functioned as a recommendation engine, the App Center is the latest way for people to discover apps, and it is becoming an increasingly prominent channel for developers to distribute their apps. On average, 220 million people visit the App Center each month, and those visitors are 40% more likely to return the next day.

We built the App Center to give the growing audience of app users a central place on Facebook to browse apps. However, given the multitude of apps that use Facebook, recommending the right apps to the right people is a tough challenge. We needed to build a system that could handle large-scale data and traffic, respond quickly, and incorporate user feedback in real time.

The goal is for curation of the App Center to be driven by quality and personalization, instead of editorialization. Just as with news feed, personalization in App Center will improve over time as people and their friends engage with more apps.

Building a recommendation engine

To solve this problem efficiently, we built a recommendation engine directly into App Center, so that, just as with news feed, each person would have a personalized experience. The recommendation engine powers the App Center and helps it learn people's preferences in order to serve them app recommendations that are timely, socially relevant, and unique to them. This allows a more diverse set of apps to become discoverable, particularly those in harder-to-find or up-and-coming categories.

The system follows an aggregator-leaf architecture, very similar to that of a search engine. Because we have a lot of data, it is necessary to partition the objects into multiple subsets (shards), where each leaf node is responsible for only one subset. The aggregator acts as a central controller, receiving the recommendation request from the front-end web server and distributing it to the leaf nodes. Each leaf node then finds a set of best candidates from the objects stored on its local machine and returns them to the aggregator. The aggregator then performs a final merge and returns the best results to the client.
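The scatter-and-merge flow described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Leaf` and `Aggregator` classes, the precomputed per-app scores, and the app names are hypothetical and are not taken from Facebook's implementation.

```python
import heapq
from typing import Dict, List, Tuple

Scored = Tuple[float, str]  # (score, app_id)


class Leaf:
    """Holds one shard of apps and returns its local top-k candidates."""

    def __init__(self, shard: Dict[str, float]) -> None:
        # app_id -> score; a stand-in for the real per-request scoring step
        self.shard = shard

    def top_k(self, k: int) -> List[Scored]:
        return heapq.nlargest(k, ((score, app) for app, score in self.shard.items()))


class Aggregator:
    """Fans a request out to every leaf and merges the partial results."""

    def __init__(self, leaves: List[Leaf]) -> None:
        self.leaves = leaves

    def recommend(self, k: int) -> List[str]:
        # Each leaf returns at most k hits, so the global top k is among them.
        partial = [hit for leaf in self.leaves for hit in leaf.top_k(k)]
        return [app for _, app in heapq.nlargest(k, partial)]


leaves = [Leaf({"chess": 0.9, "quiz": 0.4}), Leaf({"music": 0.7, "maps": 0.8})]
top = Aggregator(leaves).recommend(k=3)  # ["chess", "maps", "music"]
```

Asking every leaf for its own top k is what makes the final merge cheap: the aggregator only ever looks at k items per shard, regardless of how large each shard is.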

After that, the front end collects user feedback, which is then integrated into the app recommendation engine. We scale this system in two ways. The first is to increase the number of shards so that we can handle more data. The second is to run multiple replicas so that we can handle more traffic. Using replicas also adds redundancy to the system, which allows us to tolerate the failure of some machines.
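As a rough illustration of the two scaling axes, the sketch below assigns objects to shards with a stable hash and retries a request on another replica when one machine fails. The function names, the use of MD5 for sharding, and the `ConnectionError` failure model are all assumptions made for this example; the article does not say how sharding or failover is actually implemented.

```python
import hashlib
from typing import Callable, List, Sequence


def shard_for(app_id: str, num_shards: int) -> int:
    """Map an app to a shard with a stable hash (built-in hash() is salted per process)."""
    digest = hashlib.md5(app_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def query_with_failover(
    replicas: Sequence[Callable[[str], List[str]]], request: str
) -> List[str]:
    """Try each replica of a shard in turn, tolerating individual machine failures."""
    last_error = None
    for replica in replicas:
        try:
            return replica(request)
        except ConnectionError as err:
            last_error = err
    raise RuntimeError("all replicas failed") from last_error


def broken(_request: str) -> List[str]:
    raise ConnectionError("machine down")  # simulated failed machine


def healthy(request: str) -> List[str]:
    return [f"result-for-{request}"]


result = query_with_failover([broken, healthy], "user42")
```

More shards shrink the amount of data each leaf holds, while more replicas of the same shard add both throughput and redundancy.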

Determining high quality

Growth in the App Center is tied to quality, and we determine that quality based on user ratings and positive/negative user signals for an app over time.

To measure quality accurately, we developed a system that randomly surveys users, asking them to rate an app shortly after we detect that they have used it. Then, when we compute the average rating for an app, we include a confidence adjustment based on the number of ratings the app has received (for example, a Wilson score interval, as the Chinese translator's note suggests).
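The article does not give the formula for the confidence adjustment. One simple option, shown here purely as an assumption, is a damped (Bayesian) average that pulls an app's mean rating toward a prior until enough ratings accumulate. The `prior_mean` and `prior_weight` values are hypothetical.

```python
from typing import Sequence


def adjusted_rating(
    ratings: Sequence[float], prior_mean: float = 3.0, prior_weight: float = 10.0
) -> float:
    """Average rating shrunk toward prior_mean; the shrinkage fades as ratings accumulate."""
    # Equivalent to adding prior_weight imaginary ratings of value prior_mean.
    return (prior_weight * prior_mean + sum(ratings)) / (prior_weight + len(ratings))


few = adjusted_rating([5, 5])       # two 5-star ratings: about 3.33
many = adjusted_rating([5] * 200)   # two hundred 5-star ratings: about 4.90
```

With this kind of adjustment, an app with two perfect ratings does not outrank an app with hundreds of them.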

We found that the number of daily active users (i.e., the average number of users who used the app in a day) was a good measure of how large the app is, while the number of monthly active users could be inflated by spikes of activity during the month. So we settled on a formula for app quality that is primarily a function of the app's average rating and its average daily active users.
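The actual quality formula is not published. The sketch below only illustrates the shape of such a function: it takes the two inputs the article names (average rating and average daily active users) and combines them with a hypothetical log scaling that is an assumption of this example.

```python
import math
from typing import Sequence


def average_dau(daily_counts: Sequence[int]) -> float:
    """Mean number of daily active users over the period."""
    return sum(daily_counts) / len(daily_counts)


def quality(avg_rating: float, avg_dau: float) -> float:
    """Hypothetical quality score: rating weighted by the log of audience size."""
    return avg_rating * math.log10(1 + avg_dau)


steady = [1000] * 30            # 1,000 users every day
spiky = [100] * 29 + [20000]    # quiet month with a single-day spike

# The spike would count fully toward monthly actives, but it only
# raises the daily average to about 763, below the steady app's 1,000.
steady_quality = quality(4.0, average_dau(steady))
spiky_quality = quality(4.0, average_dau(spiky))
```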

Algorithmic elements

From the algorithmic point of view, the App Center recommendation system has three major elements: candidate selection, scoring and ranking, and real-time updates.

The key to candidate selection is efficiency and high recall. We use several heuristics to choose promising candidates. The first is the selection of popular items based on a user's demographic information. The second is the selection of social items, because we believe that people are generally interested in their friends' activities. The third is to select items related to objects the user has liked or interacted with in the past.
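The three heuristics above amount to taking the union of three cheap lookups. In the sketch below, the data structures (`popular_by_demo`, `friends_apps`, `related`) and every key and value in them are hypothetical stand-ins for whatever indexes the real system keeps.

```python
from typing import Dict, List, Set


def select_candidates(
    user: Dict[str, str],
    popular_by_demo: Dict[str, List[str]],
    friends_apps: List[str],
    related: Dict[str, List[str]],
    liked: List[str],
) -> Set[str]:
    """Union of the three heuristics; favors recall over precision."""
    candidates: Set[str] = set()
    # 1) Items popular within the user's demographic group.
    candidates.update(popular_by_demo.get(user["demo"], []))
    # 2) Social items: apps the user's friends have been active in.
    candidates.update(friends_apps)
    # 3) Items related to objects the user liked or interacted with.
    for obj in liked:
        candidates.update(related.get(obj, []))
    return candidates


candidates = select_candidates(
    user={"demo": "us-18-24"},
    popular_by_demo={"us-18-24": ["quiz", "chess"]},
    friends_apps=["music"],
    related={"chess-club-page": ["chess", "go"]},
    liked=["chess-club-page"],
)
```

Because this stage only needs high recall, it can afford to over-select; the scoring and ranking stage is what narrows the set down.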

Once we obtain a set of candidates, we fetch their features from local storage and calculate ranking scores for them. A good scoring function should be able to capture high-order interactions among three types of features.

The first type is explicit features that we can obtain directly, such as demographic information about the user. The second type is dynamic features, such as the number of likes and impressions for objects. The third type, learned latent features, is more interesting. These features are learned from the user-object interaction history and can capture user preference and object flavor.
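To make the three feature types concrete, here is a deliberately simple scoring sketch: a logistic blend of one explicit feature, one dynamic feature, and the latent dot product. The weights, the bias, and the smoothing constants are invented for illustration, and a linear blend like this does not by itself capture the high-order interactions the article mentions; the real scoring function is not described in the source.

```python
import math
from typing import Sequence, Tuple


def smoothed_rate(
    likes: int, impressions: int, prior_likes: float = 1.0, prior_impressions: float = 20.0
) -> float:
    """Dynamic feature: like rate, smoothed so that low-traffic apps are not extreme."""
    return (likes + prior_likes) / (impressions + prior_impressions)


def score(
    demo_match: float,                 # explicit feature, e.g. 1.0 if the app targets the user's group
    likes: int,
    impressions: int,
    user_vec: Sequence[float],         # learned latent user features
    app_vec: Sequence[float],          # learned latent app features
    weights: Tuple[float, float, float] = (1.0, 2.0, 1.5),  # hypothetical weights
    bias: float = -1.0,
) -> float:
    latent = sum(u * a for u, a in zip(user_vec, app_vec))
    z = (
        bias
        + weights[0] * demo_match
        + weights[1] * smoothed_rate(likes, impressions)
        + weights[2] * latent
    )
    return 1.0 / (1.0 + math.exp(-z))  # squash to (0, 1)


s_low = score(1.0, 5, 100, [0.1, 0.2], [0.1, 0.1])
s_high = score(1.0, 5, 100, [0.1, 0.2], [0.9, 0.8])
```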

The underlying principle of learning latent features is low-rank approximation of a matrix (in other words, matrix factorization). The basic problem is to find the values of the missing entries in the user-object response matrix. The idea is to approximate the response matrix using the product of two low-rank matrices. Each row of matrix U is the latent representation of a user and captures that user's intrinsic taste. Each column of matrix O is the latent representation of an object and reflects the flavor of that object. The dot product between these two vectors is the predicted response from the user to the object.
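A minimal version of this idea fits U and O by stochastic gradient descent on the observed entries only, then predicts a missing entry with a dot product. The article does not say which learning algorithm was used, so the optimizer, the hyperparameters (`rank`, `lr`, `reg`, `epochs`), and the toy data below are all assumptions.

```python
import random
from typing import Dict, List, Sequence, Tuple

Matrix = List[List[float]]


def dot(u: Sequence[float], o: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, o))


def squared_error(observed: Dict[Tuple[int, int], float], U: Matrix, O: Matrix) -> float:
    return sum((r - dot(U[i], O[j])) ** 2 for (i, j), r in observed.items())


def factorize(
    observed: Dict[Tuple[int, int], float],
    n_users: int,
    n_objects: int,
    rank: int = 2,
    lr: float = 0.05,
    reg: float = 0.01,
    epochs: int = 500,
    seed: int = 0,
) -> Tuple[Matrix, Matrix]:
    """Fit response[i][j] ~ dot(U[i], O[j]) on observed entries with regularized SGD."""
    rng = random.Random(seed)
    U = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_users)]
    # O[j] holds the latent vector of object j (column j of the article's matrix O).
    O = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_objects)]
    for _ in range(epochs):
        for (i, j), r in observed.items():
            err = r - dot(U[i], O[j])
            for f in range(rank):
                u, o = U[i][f], O[j][f]
                U[i][f] += lr * (err * o - reg * u)
                O[j][f] += lr * (err * u - reg * o)
    return U, O


# Toy response matrix: (user, object) -> 1.0 for a positive response, 0.0 for none.
observed = {(0, 0): 1.0, (0, 1): 0.0, (1, 0): 1.0, (1, 2): 1.0, (2, 1): 1.0, (2, 2): 0.0}
U0, O0 = factorize(observed, n_users=3, n_objects=3, epochs=0)  # untrained baseline
U, O = factorize(observed, n_users=3, n_objects=3)
predicted = dot(U[0], O[2])  # predicted response for a missing entry
```

Only the observed entries drive the updates, which is what lets the approach scale to a matrix that is overwhelmingly empty.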

Remember, we have more than 950 million users, and even more objects. Our matrix is huge, and the major challenge is how to learn the latent features efficiently. We developed algorithms to compute the latent features from the huge amount of historical data and to update them in real time as new user feedback comes in.

The ability to do real-time updates as new objects and events come in is one of the most important features for recommending the best apps to people. When feedback comes in, we need to do several things. One is to update the index so that new objects are available for candidate selection. The new actions from each user are added to the index in real time so that friends' activities are immediately available for recommendation. We also update the user history so that we can make recommendations based on the user's latest activities. The dynamic features are updated as well, so that the current counts for shares, likes, and impressions can be used accurately for scoring. Finally, the latent features are updated in real time, so that the system can learn user taste and object flavor from the latest activities.
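The update steps listed above can be pictured as one event handler that touches each piece of state in turn. The class below is a hypothetical in-memory sketch: the field names, the single-step SGD update for latent features, and the default vector values are assumptions for illustration, not a description of the production system.

```python
from collections import defaultdict
from typing import DefaultDict, List, Set


class RealtimeState:
    """In-memory stand-in for the index, user history, counters, and latent vectors."""

    def __init__(self, rank: int = 2, lr: float = 0.05) -> None:
        self.rank, self.lr = rank, lr
        self.actions_by_user: DefaultDict[str, Set[str]] = defaultdict(set)
        self.history: DefaultDict[str, List[str]] = defaultdict(list)
        self.counts: DefaultDict[str, DefaultDict[str, int]] = defaultdict(
            lambda: defaultdict(int)
        )
        self.user_vecs: DefaultDict[str, List[float]] = defaultdict(lambda: [0.1] * rank)
        self.app_vecs: DefaultDict[str, List[float]] = defaultdict(lambda: [0.1] * rank)

    def predict(self, user: str, app: str) -> float:
        return sum(u * a for u, a in zip(self.user_vecs[user], self.app_vecs[app]))

    def on_event(self, user: str, app: str, action: str, response: float) -> None:
        # 1) Index: the action becomes visible to candidate selection for friends.
        self.actions_by_user[user].add(app)
        # 2) User history: keeps recommendations in step with the latest activity.
        self.history[user].append(app)
        # 3) Dynamic features: counts of likes, shares, impressions, and so on.
        self.counts[app][action] += 1
        # 4) Latent features: one SGD step toward the observed response.
        u, a = self.user_vecs[user], self.app_vecs[app]
        err = response - self.predict(user, app)
        for f in range(self.rank):
            uf, af = u[f], a[f]
            u[f] += self.lr * err * af
            a[f] += self.lr * err * uf


state = RealtimeState()
before = state.predict("alice", "chess")
state.on_event("alice", "chess", "like", response=1.0)
after = state.predict("alice", "chess")  # moves toward the observed response
```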

The App Center has been available to people worldwide since August 1, 2012, and we will continue to make updates, such as the recently launched My Apps page, as we build a personalized App Center and app recommendation service for each person on Facebook.

Wei Xu, Xin Liu, TR Vishwanath, and the open graph engineering team all worked together to integrate the recommendation engine and App Center.
