Mahout is a machine learning toolkit that packages a large number of machine learning algorithms.
| Category | Algorithm | Notes |
| --- | --- | --- |
| Classification | Logistic Regression | |
| | Bayesian | |
| | SVM | Support vector machine |
| | Perceptron | |
| | Neural Network | |
| | Random Forests | |
| | Restricted Boltzmann Machines | |
| Clustering | Canopy Clustering | |
| | K-means Clustering | |
| | Fuzzy K-means | |
| | Expectation Maximization | EM clustering |
| | Mean Shift Clustering | |
| | Hierarchical Clustering | |
| | Dirichlet Process Clustering | |
| | Latent Dirichlet Allocation | LDA clustering |
| | Spectral Clustering | |
| Association rule mining | Parallel FP Growth Algorithm | |
| Regression | Locally Weighted Linear Regression | |
| Dimensionality reduction | Singular Value Decomposition | |
| | Principal Components Analysis | |
| | Independent Component Analysis | |
| | Gaussian Discriminative Analysis | |
| Evolutionary algorithms | Watchmaker framework | Parallelized version of the Watchmaker framework |
| Recommendation / collaborative filtering | Non-distributed recommenders | Taste (UserCF, ItemCF, SlopeOne) |
| | Distributed Recommenders | ItemCF |
| Vector similarity computation | RowSimilarityJob | Computes pairwise similarities between the rows of a matrix |
| | VectorDistanceJob | Computes distances between vectors |
| Non-MapReduce algorithms | Hidden Markov Models | |
| Collections extensions | Collections | Extends Java's Collections classes |
```java
package mahout;

import java.io.File;
import java.util.List;

import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

public class UserRecommer {

    public static void main(String[] args) throws Exception {
        // Load the userID,itemID,preference triples from the CSV file
        DataModel model = new FileDataModel(new File("xxx/intro.csv"));
        // Pearson correlation similarity; many other similarity measures are available
        UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
        // Neighborhood made of the 2 users most similar to the target user
        UserNeighborhood neighborhood = new NearestNUserNeighborhood(2, similarity, model);
        // Build the recommender
        Recommender recommender = new GenericUserBasedRecommender(model, neighborhood, similarity);
        // Recommend 1 item for user 1
        List<RecommendedItem> recommendations = recommender.recommend(1, 1);
        for (RecommendedItem recommendation : recommendations) {
            System.out.println(recommendation);
        }
    }
}
```

The result is as follows:

```
RecommendedItem[item:104, value:4.257081]
```
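The reported value 4.257081 is consistent with a similarity-weighted average over the two nearest neighbors of user 1. The worked equation below is a sketch of that arithmetic; the symbols are notation introduced here, not Mahout identifiers: s(u,v) is the Pearson similarity between users u and v, N(u) is the set of neighbors of u who rated item i, and r_{v,i} is neighbor v's preference for item i. With the ratings in intro.csv, the neighbors of user 1 are users 4 and 5, with similarities 1.0 and roughly 0.9449, and their preferences for item 104 are 4.5 and 4.0.

```latex
\hat{r}_{u,i} = \frac{\sum_{v \in N(u)} s(u,v)\, r_{v,i}}{\sum_{v \in N(u)} s(u,v)},
\qquad
\hat{r}_{1,104} = \frac{1.0 \times 4.5 + 0.9449 \times 4.0}{1.0 + 0.9449} \approx 4.2571
```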
Contents of intro.csv (one userID,itemID,preference triple per line):

```
1,101,5.0
1,102,3.0
1,103,2.5
2,101,2.0
2,102,2.5
2,103,5.0
2,104,2.0
3,101,2.5
3,104,4.0
3,105,4.5
3,107,5.0
4,101,5.0
4,103,3.0
4,104,4.5
4,106,4.0
5,101,4.0
5,102,3.0
5,103,2.0
5,104,4.0
5,105,3.5
5,106,4.0
```
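To see why a 2-user neighborhood for user 1 ends up containing users 4 and 5, the Pearson correlation can be computed by hand over the items two users have both rated. The following is a minimal plain-Java sketch with no Mahout dependency; the class `PearsonSketch` and its methods `byUser`, `similarity` and `nearest` are hypothetical names made up for illustration, and the code is not Mahout's implementation, which may differ in details.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only: Pearson similarity over co-rated items. */
class PearsonSketch {

    // Ratings copied from intro.csv as {userID, itemID, preference}.
    static final double[][] RATINGS = {
        {1, 101, 5.0}, {1, 102, 3.0}, {1, 103, 2.5},
        {2, 101, 2.0}, {2, 102, 2.5}, {2, 103, 5.0}, {2, 104, 2.0},
        {3, 101, 2.5}, {3, 104, 4.0}, {3, 105, 4.5}, {3, 107, 5.0},
        {4, 101, 5.0}, {4, 103, 3.0}, {4, 104, 4.5}, {4, 106, 4.0},
        {5, 101, 4.0}, {5, 102, 3.0}, {5, 103, 2.0}, {5, 104, 4.0},
        {5, 105, 3.5}, {5, 106, 4.0}
    };

    /** Groups the flat rating rows into userID -> (itemID -> preference). */
    static Map<Integer, Map<Integer, Double>> byUser() {
        Map<Integer, Map<Integer, Double>> prefs = new LinkedHashMap<>();
        for (double[] r : RATINGS) {
            prefs.computeIfAbsent((int) r[0], k -> new LinkedHashMap<>())
                 .put((int) r[1], r[2]);
        }
        return prefs;
    }

    /** Pearson correlation over items both users rated; NaN when undefined. */
    static double similarity(int userA, int userB) {
        Map<Integer, Map<Integer, Double>> prefs = byUser();
        Map<Integer, Double> a = prefs.get(userA);
        Map<Integer, Double> b = prefs.get(userB);

        // Collect the preference pairs for co-rated items.
        List<double[]> pairs = new ArrayList<>();
        for (Map.Entry<Integer, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) {
                pairs.add(new double[] {e.getValue(), other});
            }
        }
        if (pairs.isEmpty()) {
            return Double.NaN;
        }

        double meanA = 0.0;
        double meanB = 0.0;
        for (double[] p : pairs) {
            meanA += p[0];
            meanB += p[1];
        }
        meanA /= pairs.size();
        meanB /= pairs.size();

        double cov = 0.0;
        double varA = 0.0;
        double varB = 0.0;
        for (double[] p : pairs) {
            double da = p[0] - meanA;
            double db = p[1] - meanB;
            cov += da * db;
            varA += da * da;
            varB += db * db;
        }
        double denom = Math.sqrt(varA * varB);
        // Zero variance (e.g. a single co-rated item) leaves the correlation undefined.
        return denom == 0.0 ? Double.NaN : cov / denom;
    }

    /** Returns up to n users most similar to target, skipping undefined similarities. */
    static List<Integer> nearest(int target, int n) {
        Map<Integer, Double> sims = new LinkedHashMap<>();
        for (Integer user : byUser().keySet()) {
            if (user == target) {
                continue;
            }
            double s = similarity(target, user);
            if (!Double.isNaN(s)) {
                sims.put(user, s);
            }
        }
        List<Integer> users = new ArrayList<>(sims.keySet());
        users.sort(Comparator.comparingDouble((Integer u) -> sims.get(u)).reversed());
        return users.subList(0, Math.min(n, users.size()));
    }

    public static void main(String[] args) {
        for (int other = 2; other <= 5; other++) {
            System.out.printf("similarity(1, %d) = %.4f%n", other, similarity(1, other));
        }
        System.out.println("neighbors of user 1: " + nearest(1, 2));
    }
}
```

On this data the sketch ranks users 4 and 5 as the two nearest neighbors of user 1. User 3 shares only one rated item with user 1, so the correlation is undefined and that user is skipped, while user 2 has a negative correlation and ranks last.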