The Universal Recommender

The Universal Recommender (UR) is a new type of collaborative filtering recommender based on an algorithm that can use data from a wide variety of user taste indicators: the Correlated Cross-Occurrence (CCO) algorithm. Unlike the matrix factorization embodied in tools like MLlib's ALS, the UR's CCO algorithm can ingest any number of user actions, events, profile data, and contextual information, and it serves results in a fast and scalable way. It also supports item properties for filtering and boosting recommendations, and can therefore be considered a hybrid collaborative filtering and content-based recommender.

The use of multiple types of data fundamentally changes the way a recommender is used and, when employed correctly, provides a significant increase in recommendation quality versus using only one user event. Most recommenders, for instance, can only use "purchase" events. Using everything we know about a user and their context lets us predict their preferences much better.

AN EXAMPLE WITH A SINGLE USER ACTION

User  Action  Item
u1    view    t1
u1    view    t2
u1    view    t3
u1    view    t5
u2    view    t1
u2    view    t3
u2    view    t4
u2    view    t5
u3    view    t2
u3    view    t3
u3    view    t5

After organizing the data, we get the following relations:
u1 => [ t1, t2, t3, t5 ]
u2 => [ t1, t3, t4, t5 ]
u3 => [ t2, t3, t5 ]

THE USUAL MODEL OF PERSONALIZED RECOMMENDATION

$r=(P^{T}P)h_{p}$

  • $r$ = recommendations
  • $h_{p}$ = a given user's history of the primary action (for example, purchases)

    • $h_{u1}=\begin{bmatrix}1 & 1 & 1 & 0 & 1\end{bmatrix}$
    • $h_{u2}=\begin{bmatrix}1 & 0 & 1 & 1 & 1\end{bmatrix}$
    • Actions on a given item may repeat over the user's history. How should that be expressed?

      $h_{u1}=\begin{bmatrix}1 & 2 & 1 & 0 & 1\end{bmatrix}$, where the 2 means item 2 was purchased twice

      If we represent history this way, a problem arises: recent actions and long-ago actions do not mean the same thing. Occasionally getting injured and buying a crutch should not, by itself, lead to crutch recommendations. Can LLR help dampen this kind of situation?

  • $P$ = the matrix formed by the primary action (primary event) of all users in the history

    • primary action: the notion of a primary action only matters in the CCO model; with a single-indicator recommender there is no primary action to distinguish.
    • Rows represent users, columns represent items

      $P=\begin{bmatrix}1 & 1 & 1& 0 & 1\\ 1 & 0 & 1 & 1 &1 \\ 0& 1& 1 & 0 & 1\end{bmatrix}$

  • $(P^{T}P)$ = compares column to column using a log-likelihood based correlation test

    • $P=\begin{bmatrix}1 & 1 & 1 & 0 & 1\\ 1 & 0 & 1 & 1 & 1\\ 0 & 1 & 1 & 0 & 1\end{bmatrix}$ $P^{T}=\begin{bmatrix}1 & 1 & 0\\ 1 & 0 & 1\\ 1 & 1 & 1\\ 0 & 1 & 0\\ 1 & 1 & 1\end{bmatrix}$ $P^{T}\cdot P=\begin{bmatrix}- & 1 & 2 & 1 & 2\\ 1 & - & 2 & 0 & 2\\ 2 & 2 & - & 1 & 3\\ 1 & 0 & 1 & - & 1\\ 2 & 2 & 3 & 1 & -\end{bmatrix}$ ($P^{T}$ denotes the transpose of $P$)
    • The element $C_{3,5}=3$ of $P^{T}\cdot P$ means that three users who viewed $t_{3}$ also viewed $t_{5}$ (a small sketch of this computation follows the list below).
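
To make the toy example concrete, here is a minimal sketch using plain NumPy (the variable names are mine, not part of the UR) that builds the user-by-item matrix $P$ from the view histories above and computes the raw cooccurrence matrix $P^{T}P$:

```python
import numpy as np

# User view histories from the toy example above (users u1..u3, items t1..t5).
histories = {
    "u1": ["t1", "t2", "t3", "t5"],
    "u2": ["t1", "t3", "t4", "t5"],
    "u3": ["t2", "t3", "t5"],
}
users = sorted(histories)
items = ["t1", "t2", "t3", "t4", "t5"]

# Build the users-by-items matrix P (1 = the user viewed the item).
P = np.zeros((len(users), len(items)), dtype=int)
for i, u in enumerate(users):
    for item in histories[u]:
        P[i, items.index(item)] = 1

# Raw cooccurrence: element [i, j] counts users who interacted with both items.
cooccurrence = P.T @ P
print(cooccurrence)
# The off-diagonal element for (t3, t5) is 3: all three users viewed both items.
```

The diagonal simply counts how many users touched each item and is ignored (shown as "-" in the matrices above).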

COOCCURRENCE WITH LLR

  • Let's call ($P^{T}P$) an indicator matrix for some primary action like purchase

    • Rows = items, columns = items, element = similarity/correlation score
  • The score is row compared to column using a "similarity" or "correlation" metric
  • The Log-Likelihood Ratio (LLR) finds important/correlating cooccurrences and filters out the rest, a major improvement in quality over simple cooccurrence counts or other similarity metrics.

    The LLR score is computed from the cooccurrence counts of two events and measures how strongly the two events are correlated (a sketch of the computation follows this list).

    $P^{T}\cdot P=\begin{bmatrix}- & 1 & 2 & 1 & 2\\ 1& - & 1 &1 &1\\2& 1& - & 1 &2 \\1& 1 & 1 & - &1\\2&1&2&1 & -\end{bmatrix}\overset{LLR}{\rightarrow}\begin{bmatrix}-& 1.05 & 3.82 & 1.05 &3.82 \\ 1.05 & - &1.05 &1.05 &1.05 \\ 3.82& 1.05 & - & 1.05&3.82 \\1.05&1.05 &1.05 & - &1.05 \\3.82& 1.05 & 3.82 & 1.05&-\end{bmatrix}$

    Note: we find that every user has clicked ad a4, yet a4's LLR score is 0; in other words, a4 is not correlated with any post, which looks strange at first. This is actually a characteristic of LLR: it heavily penalizes popular events. Put simply, LLR concludes that "viewed t1" and "clicked a4" co-occur not because the two events are correlated, but merely because clicking a4 is itself a high-frequency event.
  • Experiments on real-world data show LLR is significantly better than other similarity metrics
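
As a rough illustration of how an LLR score can be derived from cooccurrence counts, here is a minimal sketch following the 2x2 contingency-table formulation popularized by Ted Dunning and used in Mahout; the counts in the example call are made up:

```python
import math

def x_log_x(x: int) -> float:
    """x * ln(x), defined as 0 when x == 0."""
    return 0.0 if x == 0 else x * math.log(x)

def entropy(*counts: int) -> float:
    """Unnormalized Shannon entropy of a list of counts."""
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)

def llr(k11: int, k12: int, k21: int, k22: int) -> float:
    """Log-likelihood ratio score for a 2x2 contingency table.

    k11: both events occurred together
    k12: event A occurred without event B
    k21: event B occurred without event A
    k22: neither event occurred
    """
    row_entropy = entropy(k11 + k12, k21 + k22)
    col_entropy = entropy(k11 + k21, k12 + k22)
    mat_entropy = entropy(k11, k12, k21, k22)
    if row_entropy + col_entropy < mat_entropy:  # guard against rounding error
        return 0.0
    return 2.0 * (row_entropy + col_entropy - mat_entropy)

# Example: of 10,000 users, 100 did both A and B, 900 only A, 200 only B.
print(llr(100, 900, 200, 8800))
```

If an event occurs for every user (like the a4 ad above), the 2x2 table degenerates (k12 = k22 = 0) and the score works out to exactly 0, which is the popularity penalty described in the note above.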

LLR AND SIMILARITY METRICS: PRECISION (MAP@K)

FROM COOCCURRENCE TO RECOMMENDATION

$r=(P^{T}P)h_{p}$

  • This actually means taking the user's history $h_{p}$ and comparing it to the rows of the cooccurrence matrix $(P^{T}P)$

    $h_{p}$ = the user's history of the primary action P

  • TF-IDF weighting of the cooccurrence matrix would be nice, to mitigate the undue influence of popular items
  • Find items nearest to the user's history
  • Sort these by similarity strength and keep only the highest — you have recommendations
  • Sound familiar? Finding the k-nearest neighbors using cosine and TF-IDF?
  • That's exactly what a search engine does!
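
As a minimal sketch of this idea, with made-up data, plain NumPy, and no search engine: treat each row of the cooccurrence matrix as a "document", treat the user's history as the query, and rank items by cosine similarity.

```python
import numpy as np

# Cooccurrence matrix from the toy example (rows/columns = items t1..t5,
# diagonal zeroed since an item should not recommend itself).
A = np.array([
    [0, 1, 2, 1, 2],
    [1, 0, 2, 0, 2],
    [2, 2, 0, 1, 3],
    [1, 0, 1, 0, 1],
    [2, 2, 3, 1, 0],
], dtype=float)
items = ["t1", "t2", "t3", "t4", "t5"]

# A new user's history: they have viewed t2 only.
h = np.array([0, 1, 0, 0, 0], dtype=float)

def cosine(u, v):
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return 0.0 if denom == 0 else float(u @ v) / denom

# Score every item row against the user's history, drop already-seen items, rank.
scores = {item: cosine(row, h) for item, row in zip(items, A)}
already_seen = {items[i] for i in np.nonzero(h)[0]}
recs = sorted(
    (it for it in scores if it not in already_seen),
    key=scores.get, reverse=True,
)
print([(it, round(scores[it], 3)) for it in recs])
```

A search engine performs essentially the same comparison, but with TF-IDF weighting and norms built into the index, which also provides the popularity damping mentioned above.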

USER HISTORY + COOCCURRENCES + SEARCH = RECOMMENDATIONS

$r=(P^{t}P)h_{p}$

  • The final calculation uses $h_{p}$ as the query on the cooccurrence matrix $(P^{T}P)$ and returns a ranked set of items
  • The query is a "similarity" query, not a relational or key-based fetch
  • Uses a search engine as a cosine-based k-nearest-neighbor (KNN) engine with norms and TF-IDF weighting
  • Highly optimized for serving these queries in realtime
  • Several (Solr, Elasticsearch) offer high-availability, massively scalable, clustered auto-sharding features like the best of the NoSQL DBs

THE BREAKTHROUGH IDEA OF THE UR

  • Almost all collaborative filtering recommenders compute recommendations from a single preference indicator:

    $r=(P^{T}P)h_{p}$

  • CCO-based collaborative filtering recommendation can be expressed as:

    $r=(P^{T}P)h_{p}+(P^{T}V)h_{v}+(P^{T}C)h_{c}+…$

    • $(P^{T}P)$ = the correlation matrix of P with P; $(P^{T}V)$ = the correlation matrix of P with V
    • the terms $(P^{T}V)h_{v}+(P^{T}C)h_{c}$ represent the cross-occurrence part
    • $h_{p}$ = the user's history of action P; $h_{v}$ = the user's history of action V
  • With CCO-based recommendation, essentially any user indicator we can think of can improve recommendation quality: purchase behavior, viewing behavior, category preferences, location preferences, device preferences, user gender, and so on (a small sketch follows this list).
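
To illustrate the cross-occurrence idea, here is a minimal sketch with made-up matrices and plain NumPy; it combines a primary "purchase" matrix $P$ with a secondary "view" matrix $V$ for the same users and scores one user with the first two terms of the formula above (the LLR filtering step is omitted for brevity):

```python
import numpy as np

# Rows = users u1..u3, columns = items t1..t5.
# P records the primary action (purchase), V a secondary action (view).
P = np.array([
    [1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1],
    [1, 0, 0, 1, 0],
], dtype=float)
V = np.array([
    [1, 1, 1, 0, 1],
    [1, 0, 1, 1, 1],
    [0, 1, 1, 0, 1],
], dtype=float)

# Cooccurrence (purchase vs. purchase) and cross-occurrence (purchase vs. view).
PtP = P.T @ P   # which purchases correlate with which purchases
PtV = P.T @ V   # which views correlate with which purchases

# One user's histories: h_p = items purchased, h_v = items viewed.
h_p = np.array([1, 0, 0, 0, 0], dtype=float)
h_v = np.array([1, 0, 1, 0, 1], dtype=float)

# r = (P^T P) h_p + (P^T V) h_v  -- higher scores mean stronger evidence.
r = PtP @ h_p + PtV @ h_v
for item, score in sorted(zip(["t1", "t2", "t3", "t4", "t5"], r),
                          key=lambda x: -x[1]):
    print(item, score)
```

In the full UR, each of these matrices is first LLR-filtered into a sparse indicator matrix, and the sum is evaluated by the search engine rather than by dense matrix products.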

CORRELATED CROSS-OCCURRENCE: SO WHAT?

  • Comparing the history of the primary action to other actions finds actions that lead to the one you want to recommend
  • Given strong data about user preferences across a general population, we can also use:

    • items clicked
    • terms searched
    • categories viewed
    • items shared
    • people followed
    • items disliked (yes, dislikes may predict likes)
    • location
    • device preference
    • gender
    • age bracket (for example, people in the 10-20 age bracket)
  • Virtually anything we know about the population can be tested for correlation and used to predict a particular user's preferences

CORRELATED CROSS-OCCURRENCE: ADDING CONTENT MODELS

  • Collaborative Topic Filtering

    • Use Latent Dirichlet Allocation(LDA) to model topics directly from the textual content
    • Calculate topics from Word2Vec-style word vectors instead of bag-of-words analysis to boost quality
    • Create cross-occurrence indicators from topics the user has preferred
    • Repeat periodically
  • Entity Preferences:

    • Use a Named Entity Recognition(NER) system to find entities in textual content
    • Create cross-occurrence indicators for these entities
  • Entities and topics are long-lived and richly describe user interests; they are very good for use in the Universal Recommender (a small sketch of the topic-indicator idea follows).
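
As a rough sketch of the collaborative topic filtering idea above, assuming scikit-learn's bag-of-words LDA (not the Word2Vec-based variant mentioned as a quality boost) and made-up item texts and view histories: derive a dominant topic per item, then record for each user the topics of the items they have viewed; that per-user topic history can be fed to CCO as one more secondary action.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Made-up item descriptions and user view histories.
item_texts = {
    "t1": "running shoes lightweight trail running",
    "t2": "road bike carbon frame racing",
    "t3": "trail running socks breathable",
    "t4": "bike helmet road safety",
    "t5": "running jacket waterproof trail",
}
views = {"u1": ["t1", "t3", "t5"], "u2": ["t2", "t4"], "u3": ["t1", "t2"]}

items = list(item_texts)
X = CountVectorizer().fit_transform(item_texts[i] for i in items)

# Fit a tiny topic model and take each item's dominant topic.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(X)
item_topic = {it: int(doc_topics[i].argmax()) for i, it in enumerate(items)}

# Each user's "topic preference" history, usable as a cross-occurrence indicator.
user_topic_history = {
    u: sorted({f"topic_{item_topic[it]}" for it in its})
    for u, its in views.items()
}
print(user_topic_history)
```

The per-user topic histories then play the same role as any other secondary action in the CCO formula.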

THE UNIVERSAL RECOMMENDER: ADDING CONTENT-BASED RECS

Indicators can also be based on content similarity

$r=(TT^{T})h_{t}+I\cdot L$

$(TT^{T})$ compares every pair of documents to each other and finds the most similar, based on content alone
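
A minimal sketch of this kind of content-similarity indicator, assuming scikit-learn, TF-IDF over bag-of-words, and made-up item texts: compute an item-to-item similarity matrix and keep each item's most similar neighbor.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

items = ["t1", "t2", "t3"]
texts = [
    "trail running shoes lightweight",
    "carbon road bike racing frame",
    "waterproof trail running jacket",
]

# T: items-by-terms TF-IDF matrix; sim plays the role of (T T^T).
T = TfidfVectorizer().fit_transform(texts)
sim = cosine_similarity(T)

# Keep each item's most similar other item as its content indicator.
for i, item in enumerate(items):
    sims = [(items[j], sim[i, j]) for j in range(len(items)) if j != i]
    best = max(sims, key=lambda x: x[1])
    print(item, "->", best[0], round(best[1], 3))
```

The resulting per-item neighbor lists can be indexed as just another field, so content similarity participates in the same unified query as the behavioral indicators.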

INDICATOR TYPES

  • Cooccurrences

    • Find the best indicator of a user preference for the item type to be recommended: examples are "buy", "read", "video_watch", "share", "follow", "like"
  • Cross-occurrence

    • Item metadata as "user" preference, for example: treat an item's category as a user's category-preference
    • Calculated from user actions on any data that may give information about the user: category-preferences, search terms, gender, location
    • Create with Mahout-Samsara SimilarityAnalysis.cooccurrences
  • Content or metadata

    • Content text, tags, categories, description text, anything describing an item
    • Create with Mahout-Samsara SimilarityAnalysis.rowSimilarity
  • Intrinsic

    • Popularity rank, geo-location, anything describing an item
    • Some may be derived from usage data, like popularity rank or "hotness"
    • Is a known or specially calculated property of the item
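
As a tiny sketch of one intrinsic indicator of the kind listed above, using a made-up event log and plain Python: derive a popularity rank from recent usage data; the result can be attached to items as a property and used for boosting or backfill.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Made-up usage events: (timestamp, item).
now = datetime.now(timezone.utc)
events = [
    (now - timedelta(days=1), "t3"),
    (now - timedelta(days=2), "t3"),
    (now - timedelta(days=3), "t5"),
    (now - timedelta(days=40), "t1"),  # too old to count as "hot"
]

# Count only events inside a recent window to approximate "hotness".
window = timedelta(days=30)
recent = Counter(item for ts, item in events if now - ts <= window)

# Popularity rank: most recently-active items first.
popularity_rank = [item for item, _ in recent.most_common()]
print(popularity_rank)  # ['t3', 't5']
```

Such a rank is a known property of the item itself, so it can back-fill recommendations for brand-new users with no history.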

THE UNIVERSAL RECOMMENDER AKA THE WHOLE ENCHILADA

"Universal" means one query on all indicators at once

$r=(P^{T}P)h_{p}+(P^{T}V)h_{v}+(P^{T}C)h_{c}+…+(TT^{T})h_{t}+I\cdot L$

Unified query:

  • purchase-correlator: user-history-of-purchases
  • view-correlator: user-history-of-views
  • category-correlator: user-history-of-categories-viewed
  • tags-correlator: user-history-of-tag-preferences
  • geo-location-correlator: user-location
  • ...

Once indicators are indexed as search fields, this entire equation becomes a single query
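
A minimal sketch of what such a unified query could look like, assuming an Elasticsearch index named urindex whose documents are items with one field per correlator (the field names, index name, and endpoint are illustrative, not the UR's actual schema): the user's recent history for each indicator becomes a terms clause, and the should clauses add up into a single relevance score.

```python
import json
import urllib.request

# One bool/should query: each clause matches one indicator field against the
# user's recent history for that indicator; scores from all clauses add up.
query = {
    "size": 10,
    "query": {
        "bool": {
            "should": [
                {"terms": {"purchase-correlator": ["t3", "t5"]}},
                {"terms": {"view-correlator": ["t1", "t3", "t5"]}},
                {"terms": {"category-correlator": ["running", "outdoor"]}},
            ],
            "must_not": [
                # Blacklist items the user has already purchased.
                {"ids": {"values": ["t3", "t5"]}}
            ],
        }
    },
}

req = urllib.request.Request(
    "http://localhost:9200/urindex/_search",
    data=json.dumps(query).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp))
```

Because all indicator fields live in one index, the whole multi-indicator formula is answered by this single query in realtime.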

Fast!

THE UNIVERSAL RECOMMENDER: BETTER USER COVERAGE

  • Any number of user actions — entire user clickstream
  • Metadata: from user profiles or items
  • Context: on-site behavior, time, location
  • Content: unstructured text or semi-structured categorical data
  • Mixes any number of "indicators" to increase quality or tune to specific context
  • A solution to the "cold-start" problem: items with too short a lifespan, or new users with no history

    How is this solved? See the next two points:
  • Can recommend to new users using realtime history
  • Can use new interaction data from any user in realtime
  • 95% implemented in Universal Recommender

    v0.3.0 is the most current release

POLISH THE APPLE

  • Dithering to auto-optimize via explore-exploit:

    Randomize some of the returned recommendations; if they are acted upon, they become part of the new training data and are more likely to be recommended in the future (a small sketch follows this list)

  • Visibility control:

    • Don't show duplicates; blacklist items already shown
    • Filter items the user has already seen
  • Zero-downtime deployment: deploy the prediction server once, then hot-swap in the new index when it is ready
  • Generate some intrinsic indicators like "hot" or "popular"; this helps solve the "cold-start" problem
  • Asymmetric train vs. query: query with the most recent user data, train on all historical data
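
As a minimal sketch of the dithering idea from the first bullet above (the scoring and decay constant are illustrative, not the UR's exact implementation): perturb the ranked list with rank-dependent noise so that lower-ranked items occasionally surface, generating the exploration data that explore-exploit needs.

```python
import math
import random

def dither(ranked_items, epsilon=1.5, seed=None):
    """Re-rank recommendations with rank-dependent noise.

    Items keep roughly their order, but lower-ranked ones occasionally
    jump up, so the system gathers feedback on them too.
    """
    rng = random.Random(seed)
    noisy = [
        # log(rank) keeps the top items mostly stable; the Gaussian term
        # adds the randomness that drives exploration.
        (math.log(rank) + rng.gauss(0, math.log(epsilon)), item)
        for rank, item in enumerate(ranked_items, start=1)
    ]
    return [item for _, item in sorted(noisy)]

recs = ["t3", "t5", "t1", "t4", "t2"]  # output of the recommender, best first
print(dither(recs, seed=42))
```

Recommendations that surface through this noise and then get acted upon feed back into training, closing the explore-exploit loop described above.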

THE UR RECOMMENDER ARCHITECTURE BASED ON PREDICTIONIO
