The convex conjugate \(f^*: \mathbb{R}^n \to [-\infty, \infty]\) of an extended real-valued function \(f: \mathbb{R}^n \to [-\infty, \infty]\) is defined as:
\[ \begin{align*} f^*(\boldsymbol{y}) = \sup_{\boldsymbol{x} \in \mathbb{R}^n} \{ \boldsymbol{x}^\top \boldsymbol{y} - f(\boldsymbol{x}) \}, \ \boldsymbol{y} \in \mathbb{R}^n. \end{align*} \]
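A quick numerical sanity check of this definition (my own sketch, not from any paper): approximate the supremum on a finite grid for the self-conjugate function \(f(x) = x^2/2\), whose conjugate is \(f^*(y) = y^2/2\) in closed form.

```python
import numpy as np

def conjugate(f, y, grid):
    """Approximate f*(y) = sup_x { x*y - f(x) } over a finite grid of x."""
    return float(np.max(grid * y - f(grid)))

# f(x) = x^2 / 2 is self-conjugate, so f*(3) should be 3^2 / 2 = 4.5.
f = lambda x: 0.5 * x ** 2
grid = np.linspace(-10.0, 10.0, 100001)
print(conjugate(f, 3.0, grid))  # ≈ 4.5
```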
The article [6] describes the scenarios where Bregman divergence is applicable.
If you have some abstract way of measuring the "distance" between any two points and, for any choice of distribution over points the mean point minimises the average distance to all the others, then your distance measure must be a Bregman divergence.
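As a concrete sketch of the definition (my own illustration): the Bregman divergence generated by a convex \(f\) is \(D_f(x,y) = f(x) - f(y) - \langle \nabla f(y), x - y\rangle\); for \(f(x) = \tfrac{1}{2}\|x\|^2\) it reduces to half the squared Euclidean distance, which is why the mean minimizes it.

```python
import numpy as np

def bregman(f, grad_f, x, y):
    """D_f(x, y) = f(x) - f(y) - <grad f(y), x - y>."""
    return f(x) - f(y) - np.dot(grad_f(y), x - y)

# f(x) = ||x||^2 / 2  =>  D_f(x, y) = ||x - y||^2 / 2.
f = lambda v: 0.5 * np.dot(v, v)
grad_f = lambda v: np.asarray(v)
x, y = np.array([1.0, 2.0]), np.array([3.0, 0.0])
print(bregman(f, grad_f, x, y))  # 0.5 * ||x - y||^2 = 4.0
```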
The extension of Bregman divergence to matrices:
The derivation of SVD and its Python implementation were covered in detail in an earlier post on this blog; see the link below:
In domains with more than one relation matrix, one could fit each relation separately; however, this approach would not take advantage of any correlations between relations. For example, a domain with users, movies, and genres might have two relations: an integer matrix representing users’ ratings of movies on a scale of 1–5, and a binary matrix representing the genres each movie belongs to. If users tend to rate dramas higher than comedies, we would like to exploit this correlation to improve prediction.
(From reference [3])
Newton's method is used for parameter estimation, yielding \(U\), \(V\), \(Z\).
One drawback of CMF is:
\[ X=UV^T\\ Y=VW^T \]
This means that the middle entity type shares the same latent factor matrix \(V\) across the different contexts.
1. If one of the contexts suffers from cold start, then \(V\) is learned mainly from the other context.
2. Even without cold start, if the data in the two contexts are imbalanced, \(V\) is determined by the dominant context.
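A minimal sketch of the shared-\(V\) coupling \(X=UV^T\), \(Y=VW^T\) (my own toy example: plain gradient descent on synthetic data, whereas the paper uses Newton's method; all dimensions and the learning rate are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
n_u, n_v, n_w, k = 20, 15, 10, 4

# Synthetic relations generated from one shared V: X = U0 V0^T, Y = V0 W0^T.
U0 = rng.normal(size=(n_u, k))
V0 = rng.normal(size=(n_v, k))
W0 = rng.normal(size=(n_w, k))
X, Y = U0 @ V0.T, V0 @ W0.T

U = rng.normal(scale=0.1, size=(n_u, k))
V = rng.normal(scale=0.1, size=(n_v, k))
W = rng.normal(scale=0.1, size=(n_w, k))

init_loss = np.linalg.norm(U @ V.T - X) ** 2 + np.linalg.norm(V @ W.T - Y) ** 2
lr = 0.002
for _ in range(3000):
    Ex = U @ V.T - X            # residual of the first relation
    Ey = V @ W.T - Y            # residual of the second relation
    gU, gW = Ex @ V, Ey.T @ V
    gV = Ex.T @ U + Ey @ W      # V collects gradients from BOTH relations
    U -= lr * gU
    W -= lr * gW
    V -= lr * gV

loss = np.linalg.norm(U @ V.T - X) ** 2 + np.linalg.norm(V @ W.T - Y) ** 2
print(init_loss, loss)  # the shared-V fit drives both residuals down
```

Because `gV` mixes gradients from both relations, whichever relation has more (or larger-scale) data dominates the learned `V`, which is exactly the imbalance problem noted above.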
Some notes, from reading the paper, on how the likelihood function is written.
Can you guess what the author does next?
EM algorithm
[1] X. Yu, et al., "Recommendation in heterogeneous information networks with implicit user feedback," in Proc. 7th ACM Conf. Recommender Syst., 2013, pp. 347–350.
[2] Y. Sun, J. Han, X. Yan, P. S. Yu, and T. Wu, "PathSim: Meta path-based top-k similarity search in heterogeneous information networks," in Proc. VLDB, 2011.
[3] S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme, "BPR: Bayesian personalized ranking from implicit feedback," in Proc. UAI, 2009.
[1] first diffuses the observed user preferences along the possible meta-paths [2]. Matrix factorization is then applied to each diffused matrix to obtain latent features for the corresponding users and items. Based on these latent features, a hybrid recommendation model is built, and Bayesian personalized ranking [3] is used for parameter estimation.
A drawback of [1] is that, unlike HeteroMF, it does not take multiple contexts into account.
See HeteSim, discussed earlier. In this formula \(:\) plays the role of \(|\); the denominator counts paths from each object to itself, and the numerator counts paths between the two objects (Tom→Mary and Mary→Tom, hence the factor of 2).
Use PathSim or a similar similarity measure to obtain the similarity matrix of item pairs.
By measuring the similarity of all item pairs with one meta-path, we can generate a symmetric similarity matrix, denoted as \(S \in \mathbb{R}^{n\times n}\). With \(L\) different meta-paths, we can calculate \(L\) different similarity matrices with different semantics accordingly, denoted as \(S^{(1)}, S^{(2)}, \cdots, S^{(L)}\).
Complete the original sparse matrix \(R\) (the paper calls this "diffusion").
Supplement: this method can also be used in traditional SVD-based recommendation. Traditional SVD handles missing values by:
1. filling with 0;
2. filling with the mean;
3. the method of this paper.
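A small sketch (my own toy data) contrasting strategies 1 and 2 with a rank-\(k\) truncated SVD:

```python
import numpy as np

# Toy rating matrix; 0 marks a missing entry (strategy 1: keep it as 0).
R = np.array([[5., 3., 0.],
              [4., 0., 1.],
              [0., 4., 2.]])
mask = R > 0

# Strategy 2: replace missing entries with the mean of the observed ratings.
R_mean = R.copy()
R_mean[~mask] = R[mask].mean()

def truncated_svd(M, k):
    """Best rank-k approximation of M via SVD."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k]

print(truncated_svd(R, 2))       # completion under fill-with-zero
print(truncated_svd(R_mean, 2))  # completion under fill-with-mean
```

The low-rank reconstruction replaces the imputed entries with values consistent with the observed structure; the two imputation choices can lead to noticeably different completions.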
By repeating the above process for all \(L\) similarity matrices, we can now generate \(L\) pairs of representations of users and items \(\left(U^{(1)}, V^{(1)}, \cdots, U^{(L)}, V^{(L)}\right)\). Each low-rank pair represents users and items under a specific similarity semantics due to the user preference diffusion process. Considering that different similarity semantics could have different importance when making recommendations, we define the recommendation model as follows:
Equation 3 in the previous section gives the estimated rating of a single item by a user.
Our ultimate goal is to recommend the top-\(k\) items to a user, ranked by likelihood. The next task is therefore to construct a measure of how well a user agrees with a given ranking, i.e., the probability of a particular ranking [3].
We use \(p(e_a > e_b; u_i|\theta)\) to denote the probability that user \(u_i\) prefers \(e_a\) over \(e_b\).
Specifically:
\[ p(e_a > e_b; u_i|\theta)=\operatorname{logistic}\left( \hat {r}\left(u_i,e_a\right)- \hat {r}\left(u_i,e_b\right)\right) \]
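In code, the pairwise preference probability is just a sigmoid of the score difference (a one-line sketch):

```python
import math

def pref_prob(r_a, r_b):
    """p(e_a > e_b; u | theta) = logistic(r(u, e_a) - r(u, e_b))."""
    return 1.0 / (1.0 + math.exp(-(r_a - r_b)))

print(pref_prob(4.2, 3.1))  # > 0.5: e_a is likely preferred
print(pref_prob(2.0, 2.0))  # 0.5: equal scores give no preference
```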
likelihood function:
objective function:
Stochastic gradient descent is used to estimate the parameters.
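A sketch of one BPR-style SGD step, assuming a simple dot-product model \(\hat r(u,e)=U_u\cdot V_e\) (the factor sizes, learning rate, and regularization below are my own made-up choices, not the paper's):

```python
import numpy as np

def bpr_sgd_step(U, V, u, a, b, lr=0.05, reg=0.01):
    """One SGD step on -log logistic(x) with x = U_u . (V_a - V_b)."""
    x = U[u] @ (V[a] - V[b])
    g = 1.0 / (1.0 + np.exp(x))          # = 1 - logistic(x), the gradient scale
    du = g * (V[a] - V[b]) - reg * U[u]
    dva = g * U[u] - reg * V[a]
    dvb = -g * U[u] - reg * V[b]
    U[u] += lr * du
    V[a] += lr * dva
    V[b] += lr * dvb

rng = np.random.default_rng(0)
U = rng.normal(scale=0.1, size=(5, 3))
V = rng.normal(scale=0.1, size=(8, 3))
before = U[0] @ (V[1] - V[2])
for _ in range(200):                     # user 0 prefers item 1 over item 2
    bpr_sgd_step(U, V, 0, 1, 2)
after = U[0] @ (V[1] - V[2])
print(before, after)  # the preference margin grows with training
```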
[1] C. Shi, C. Zhou, X. Kong, P. S. Yu, G. Liu, and B. Wang, "HeteRecom: A semantic-based recommendation system in heterogeneous networks," in Proc. 18th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, 2012, pp. 1552–1555.
[2] C. Shi, X. Kong, P. S. Yu, and S. Xie, "Relevance search in heterogeneous networks," in Proc. EDBT, 2012.
[3] C. Shi, Z. Zhang, P. Luo, P. S. Yu, Y. Yue, and B. Wu, "Semantic path based personalized recommendation on weighted heterogeneous information networks," in Proc. 24th ACM Int. Conf. Inf. Knowl. Manage., 2015, pp. 453–462.
In [1], the authors use path-based relevance measurement to build a non-personalized recommendation system over a heterogeneous network; the similarity measure comes from HeteSim [2]. The system supports both semantic recommendation and relevance recommendation, as shown in the figure below:
- Data extraction: it extracts data from different data sources (e.g., database and web) to construct the network.
- Network modeling: it constructs the HIN with a given network schema. According to the structure of data, users can specify the network schema (e.g., bipartite, star or arbitrary schema) to construct the HIN database. The database provides the store and index functions of the node table and edge table of the HIN.
- Network analysis: it analyzes the HIN and provides the recommendation services. It first computes and stores the relevance matrix of object pairs by the path-based relevance measure. Based on the relevance matrix and efficient computing strategies, the system can provide the online semantic recommendation service. Through the weight learning method, it can combine the relevance information from different semantic paths and provide online relevance recommendation service.
- Recommendation service: it provides the succinct and friendly interface of recommendation services.
- Essentially, HeteSim(s; t|P) is a pair-wise random walk based measure, which evaluates how likely s and t will meet at the same node when s follows along the path and t goes against the path.
- Since relevance paths embody different semantics, users can specify the path according to their intents. The semantic recommendation calculates the relevance matrix with HeteSim and recommends the top k objects.
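The pair-wise random-walk idea in the bullets above can be sketched as follows (my own simplification for a length-2 path User→Movie←User; HeteSim here is taken as the cosine of the two meeting distributions, with the self-to-self walks appearing in the denominator as the norms):

```python
import numpy as np

def row_normalize(M):
    s = M.sum(axis=1, keepdims=True)
    s[s == 0] = 1.0
    return M / s

def hetesim(A_left, A_right, s, t):
    """Cosine of the meeting distributions: s walks along the left half of the
    path, t walks against the right half; both arrive at the middle type."""
    ps = row_normalize(A_left)[s]      # where s can be after the left half
    pt = row_normalize(A_right)[t]     # where t can be, walking backwards
    denom = np.linalg.norm(ps) * np.linalg.norm(pt)
    return float(ps @ pt) / denom if denom else 0.0

# Toy user-movie bipartite relation; path U -> M <- U (left half = right half).
UM = np.array([[1., 1., 0.],
               [1., 1., 0.],
               [0., 0., 1.]])
print(hetesim(UM, UM, 0, 1))  # users 0 and 1 always meet -> 1.0
print(hetesim(UM, UM, 0, 2))  # disjoint movie sets -> 0.0
```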
An example:
There are many relevance paths connecting the query object and related objects, so the relevance recommendation should comprehensively consider the relevance measures based on all relevance paths. It can be depicted as follows.
Although there can be infinitely many relevance paths connecting two objects, we only need to consider short paths, since long paths are usually less important.
The question now is how to determine \(\omega_i\). The paper takes \(\omega_i\) to express the importance of the relevance path, which in turn can be expressed through the path's length and strength; the strength of a path is given by the strengths of the relations that compose it.
Relation strength:
where \(O(A|R)\) is the average out-degree of type \(A\) and \(I(B|R)\) is the average in-degree of type \(B\) based on relation \(R\).
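For instance (my own toy adjacency matrix, and assuming the averages are taken over all nodes of each type):

```python
import numpy as np

# Adjacency of relation R from type A (rows) to type B (columns).
R = np.array([[1, 1, 0],
              [0, 1, 0],
              [1, 0, 1]])

O_A = R.sum(axis=1).mean()   # O(A|R): average out-degree of type A
I_B = R.sum(axis=0).mean()   # I(B|R): average in-degree of type B
print(O_A, I_B)              # 5/3 each for this toy relation
```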
Path strength:
Relevance path importance:
Weight:
PathSim, introduced in Part 2, is unweighted; [3] extends the PathSim scheme to weighted meta-paths.
Judging from the scenario, however, u1 and u2 should be the least similar (u2 dislikes everything u1 likes), yet their similarity here is 1. This is because we only counted the number of paths and ignored the rating values (weights) along them. The paper therefore improves on this traditional similarity measure as follows: the paths are grouped by rating value (weight), and each group is called an atomic meta path. When considering the paths with rating value 1, we pretend the other paths do not exist, and use the traditional computation to obtain the similarity of u1 and u2 under the rating-1 path set. We then do the same for the paths with rating values 2, ..., 5, computing the corresponding similarity of u1 and u2 each time. Finally, summing these similarities gives the improved similarity, as shown in the figure:
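The atomic meta-path idea can be sketched as follows (my own toy data; whether the per-value similarities are summed raw or renormalized is the paper's choice, here they are simply summed):

```python
import numpy as np

def pathsim(M):
    """PathSim from a commuting matrix M = A A^T:
    sim(i, j) = 2 * M_ij / (M_ii + M_jj)."""
    d = np.diag(M)
    denom = d[:, None] + d[None, :]
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(denom > 0, 2 * M / denom, 0.0)

# Toy user x item rating matrix on a 1..5 scale (0 = unrated).
R = np.array([[5, 1, 0],
              [1, 5, 0],
              [5, 1, 0]])

# Atomic meta-paths: keep only the edges carrying one specific rating value.
S = sum(pathsim((R == v).astype(float) @ (R == v).astype(float).T)
        for v in range(1, 6))
print(np.round(S, 2))
```

In this toy matrix users 0 and 2 rate identically while users 0 and 1 are opposites; the atomic decomposition gives the opposite pair similarity 0, whereas counting raw paths (`R > 0`) would give them similarity 1.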
[1.139] N. Srebro and T. Jaakkola, 「Weighted low-rank approximations,」 in Proc. 20th Int. Conf. Mach. Learn., 2003, pp. 720–727.
[1.141] X. Yang, H. Steck, and Y. Liu, 「Circle-based recommendation in online social networks,」 in Proc. 18th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, 2012, pp. 1267–1275.
Sun Xiangguo's note:
For the Douban dataset, the context-aware side is readily available: a user's action stream is one such context. As for trust-awareness, I think it may only show up in the network among VIP users; for non-VIP users there is probably little signal, since Douban's social features are not very developed overall and users' social activity cycles on Douban are short.
Methodologically, this paper belongs to the trust-aware category. A "circle" here is essentially a group seen from a particular user's perspective. Since extracting circles from existing data is very hard (Unfortunately, in most existing multi-category rating datasets, a user's social connections from all categories are mixed together. Even if the circles were explicitly known, they may not correspond to particular item categories that a recommender system may be concerned with.), the paper's contribution is a method for inferring circles, on top of which it builds a trust-aware recommender system.
We propose a set of algorithms to infer category specific circles of friends and to infer the trust value on each link based on user rating activities in each category. To infer the trust value of a link in a circle, we first estimate a user’s expertise level in a category based on the rating activities of herself as well as all users trusting her. We then assign to users trust values proportional to their expertise levels. The reconstructed trust circles are used to develop a low-rank matrix factorization type of RS.
The paper formulates the matrix factorization as:
\(\hat{R}=r_m+QP^T\)
so the objective function to minimize is
\(\frac{1}{2}\sum_{(u,i)\in obs.}(R_{u,i}-\hat{R}_{u,i})^2+\frac{\lambda}{2}\left(\left \|P \right \|_F^2+\left \|Q \right \|_F^2\right)\)
where \(r_m\) is the baseline prediction; for the related concepts see Section 5.3 (page 104) of 《推薦系統：技術、評估及高效算法》.
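A sketch of evaluating this objective on toy data (all dimensions are made up; taking the baseline \(r_m\) to be the global mean of the observed ratings is one common choice, not necessarily the paper's):

```python
import numpy as np

def objective(R, mask, r_m, Q, P, lam):
    """(1/2) sum_obs (R - r_m - Q P^T)^2 + (lam/2)(||P||_F^2 + ||Q||_F^2)."""
    R_hat = r_m + Q @ P.T
    return (0.5 * ((R - R_hat)[mask] ** 2).sum()
            + 0.5 * lam * (np.linalg.norm(P) ** 2 + np.linalg.norm(Q) ** 2))

rng = np.random.default_rng(0)
n_users, n_items, k, lam = 4, 5, 2, 0.1
R = rng.integers(1, 6, size=(n_users, n_items)).astype(float)
mask = rng.random((n_users, n_items)) < 0.6    # which entries are observed
r_m = R[mask].mean()                           # baseline prediction

Q = rng.normal(scale=0.1, size=(n_users, k))   # user factors
P = rng.normal(scale=0.1, size=(n_items, k))   # item factors
print(objective(R, mask, r_m, Q, P, lam))
```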
The baseline model compared against in the experiments is the SocialMF model proposed in the paper's reference [18]: A. P. Singh and G. J. Gordon, "Relational learning via collective matrix factorization," in Proc. 14th ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining (KDD'08), pp. 650–658, 2008.
The social network information is represented by a matrix \(S \in \mathbb{R}^{u_0 \times u_0}\), where \(u_0\) is the number of users. The directed and weighted social relationship of user \(u\) with user \(v\) (e.g. user \(u\) trusts/knows/follows user \(v\) ) is represented by a positive value \(S_{u,v} \in (0, 1]\). An absent or unobserved social relationship is reflected by \(S_{u,v} = s_m\), where typically \(s_m = 0\). Each of the rows of the social network matrix \(S\) is normalized to 1, resulting in the new matrix \(S^∗\) with \(S_{u,v}^∗ \propto S_{u,v}\)and \(\sum_v S_{u,v}^∗ = 1\) for each user \(u\) . The idea underlying SocialMF is that neighbors in the social network may have similar interests. This similarity is enforced by the second term in the following objective function, which says that user profile \(Q_u\) should be similar to the (weighted) average of his/her friends’ profiles \(Q_v\) (measured in terms of the square error):
\[ \frac{1}{2}\sum_{(u,i)\in obs.}(R_{u,i}-\hat{R}_{u,i})^2+\frac{\beta}{2}\sum_{u}\left((Q_u-\sum_vS^*_{u,v}Q_v)(Q_u-\sum_vS^*_{u,v}Q_v)^T\right)+\frac{\lambda}{2}\left(\left \|P \right \|_F^2+\left \|Q \right \|_F^2\right) \tag{2} \]
In the second term above, \(Q_u\) is user \(u\)'s point in the latent factor space, and \(\sum_v S^*_{u,v} Q_v\) is the trust-weighted combination of \(u\)'s neighbors (the weights being \(u\)'s trust in each neighbor). The paper assumes that this weighted combination of a user's neighbors should be similar to the user, hence the squared-error form of the second term. Stochastic gradient descent can be used to estimate the parameters of this objective.
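The social regularizer (second term of Eq. 2) in code, on made-up factors; note that it vanishes when every user equals the trust-weighted mean of their neighbors:

```python
import numpy as np

rng = np.random.default_rng(1)
n_users, k, beta = 4, 2, 0.5

Q = rng.normal(size=(n_users, k))            # user latent profiles
S = rng.random((n_users, n_users))
np.fill_diagonal(S, 0.0)
S_star = S / S.sum(axis=1, keepdims=True)    # row-normalized trust matrix

# beta/2 * sum_u || Q_u - sum_v S*_{u,v} Q_v ||^2
social_term = 0.5 * beta * np.sum((Q - S_star @ Q) ** 2)
print(social_term)
```

If all users share one profile, `S_star @ Q == Q` (each row of `S_star` sums to 1), so the penalty is zero; otherwise it pulls each \(Q_u\) toward its neighbors' weighted average.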
The paper argues that a user may trust a particular friend in some categories but not necessarily trust the same friend in other categories. (Sun Xiangguo: perhaps we could partition the \(R\) matrix by movie genre, just as this paper partitions \(R\) by category.)
In this section the paper gives three ways to assign the trust values.
\[ S^{(c)*}_{u,v}=\frac{1}{|\mathcal{C}_u^{(c)}|},\forall v \in \mathcal{C}_u^{(c)} \tag{3} \]
The goal is to assign a higher trust value or weight to the friends that are experts in the circle / category. As an approximation to their level of expertise, we use the numbers of ratings they assigned to items in the category. The idea is that an expert in a category may have rated more items in that category than users who are not experts in that category.
Let \(\mathcal{C}_u^{(c)}\) be the set of friends user \(u\) trusts in category \(c\) (\(u\)'s trust circle), \(\mathcal{F}_u^{(c)}\) the set of users who trust \(u\) in category \(c\), and \(E_u^{(c)}\) the expertise level of \(u\) in category \(c\).
Method 1:
Take user \(v\)'s expertise level to be the number of ratings \(v\) has in the category, \(E_v^{(c)}=N_v^{(c)}\), so that:
\[ S_{u,v}^{(c)}=\begin{cases} N_v^{(c)}, & v\in \mathcal{C}_u^{(c)}\\ 0, & \text{otherwise} \end{cases} \tag{4} \]
Normalizing gives:
\[ S_{u,v}^{(c)*}=\frac{S_{u,v}^{(c)}}{\sum_v S_{u,v}^{(c)}}\tag{5} \]
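Eqs. (4)-(5) in code, on made-up rating counts and circles:

```python
import numpy as np

# N[v]: number of ratings user v has in category c (made-up counts).
N = np.array([10., 3., 0., 7.])
# circle[u]: the friends user u trusts in category c.
circle = {0: [1, 3], 1: [0, 2]}

n = len(N)
S = np.zeros((n, n))
for u, friends in circle.items():
    S[u, friends] = N[friends]   # eq. (4): raw trust = expertise = rating count
row = S.sum(axis=1, keepdims=True)
S_star = np.divide(S, row, out=np.zeros_like(S), where=row > 0)  # eq. (5)
print(S_star[0])  # [0. 0.3 0. 0.7]: trust split 3:7 between friends 1 and 3
```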
Method 2:
In this case, the expertise level of user \(v\) in category \(c\) is the product of two components: the first component is the number of ratings that \(v\) assigned in category \(c\), the second component is some voting value in category \(c\) from all her followers in \(\mathcal{F}_v^{(c)}\). The intuition is that if most of \(v\)’s followers have lots of ratings in category \(c\), and they all trust \(v\), it is a good indication that \(v\) is an expert in category \(c\).
From the first two methods we see that \(u\)'s trust in \(v\) is category-independent: when we consider \(u\)'s trust in \(v\) within category \(c\), only category \(c\) itself is taken into account. But is such independence across categories really a good idea? Consider an example: suppose \(u\) and \(v\) both have ratings in \(c_1\) and \(c_2\), and \(v\)'s number of ratings in \(c_1\) far exceeds that in \(c_2\). Then \(u\)'s trust in \(v\) should arguably be higher in \(c_1\) than in \(c_2\). Neither of the previous two schemes accounts for this, so the trust values of the two categories end up out of line with reality. Hence the third scheme:
\[ S_{u,v}^{(c)}=\begin{cases} \frac{N_v^{(c)}}{\sum_{c'} N_v^{(c')}}, & v\in \mathcal{C}_u^{(c)}\\ 0, & \text{otherwise} \end{cases} \tag{6} \]
That is, instead of considering only the current category \(c\), we compare across the categories \(v\) is active in to determine the relative magnitude. As before, we then normalize:
\[ S_{u,v}^{(c)*}=\frac{S_{u,v}^{(c)}}{\sum_v S_{u,v}^{(c)}}\tag{7} \]
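Eqs. (6)-(7) in code (made-up counts), showing how the same friend gets different trust in different categories:

```python
import numpy as np

# N[v, c]: number of ratings user v has in category c (made-up counts).
N = np.array([[8., 2.],    # v = 0 rates mostly in category 0
              [3., 3.]])   # v = 1 rates both categories equally
friends = [0, 1]           # users that u trusts (same circle in both categories)

share = N / N.sum(axis=1, keepdims=True)  # eq. (6): v's share of its own ratings
for c in range(N.shape[1]):
    S = share[friends, c]
    S_star = S / S.sum()                  # eq. (7): normalize within the circle
    print(c, S_star)
# Friend 0 gets more trust in category 0 and less in category 1,
# even though the circle itself is identical in both categories.
```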
Referring back to Equation 2 of Section 2.2, we now use that formula to train within each category \(c\) separately, giving:
Gradient descent is used for parameter estimation.
Given the sparsity of the data, we want to extend the first line of the training formula above to the global scope, i.e.: