Ref: http://scikit-learn.org/stable/modules/naive_bayes.html
For the underlying theory, see: Statistical Learning Notes (4) — Naive Bayes, an application of conditional probability.
Estimation example: X = {Home Owner = No, Marital Status = Married, Annual Income = 120K}, under the assumption that "every feature is conditionally independent".
P(No) × P(Home Owner=No|No) × P(Marital Status=Married|No) × P(Income=120K|No) = 0.7 × 4/7 × 4/7 × 0.0072 ≈ 0.0016
P(Yes) × P(Home Owner=No|Yes) × P(Marital Status=Married|Yes) × P(Income=120K|Yes) = 0.3 × 1 × 0 × 1.2×10⁻⁹ = 0
Since 0.0016 > 0, the example is classified as No. Note how a single zero factor (no married record in the training data has class Yes) collapses the entire Yes product to 0.
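A few lines of Python can reproduce the hand computation above; the conditional probabilities are the ones quoted in the worked example:

```python
# Reproduce the hand computation from the example above.
p_no = 0.7 * (4 / 7) * (4 / 7) * 0.0072   # posterior score for class No, ~0.0016
p_yes = 0.3 * 1 * 0 * 1.2e-9              # the zero factor forces this to 0
print("p_no  = %.4f" % p_no)
print("p_yes = %.4f" % p_yes)
print("prediction:", "No" if p_no > p_yes else "Yes")
```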
from sklearn import datasets
from sklearn.naive_bayes import GaussianNB

iris = datasets.load_iris()
gnb = GaussianNB()
y_pred = gnb.fit(iris.data, iris.target).predict(iris.data)
print("Number of mislabeled points out of a total %d points : %d"
      % (iris.data.shape[0], (iris.target != y_pred).sum()))
Number of mislabeled points out of a total 150 points : 6
As the output shows, 6 of the 150 points are mislabeled, i.e. their predicted class differs from the true label.
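GaussianNB works by fitting a per-class Gaussian to each feature. A short sketch inspecting what the model actually learned (`theta_` is scikit-learn's attribute for the per-class feature means):

```python
from sklearn import datasets
from sklearn.naive_bayes import GaussianNB

iris = datasets.load_iris()
gnb = GaussianNB().fit(iris.data, iris.target)

# One mean per (class, feature) pair: 3 iris classes x 4 features.
print(gnb.theta_.shape)  # (3, 4)
# Per-class probabilities for the first sample; they sum to 1.
print(gnb.predict_proba(iris.data[:1]).round(3))
```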
Choosing a distributional assumption that matches the data is the right way to use Naive Bayes; scikit-learn provides Gaussian, multinomial, and Bernoulli variants.
import numpy as np
from sklearn.naive_bayes import MultinomialNB

X = np.random.randint(5, size=(6, 100))  # random integers in [0, 5)
y = np.array([1, 2, 3, 4, 5, 6])         # the six rows of X map to six classes

clf = MultinomialNB()
clf.fit(X, y)
print(clf.predict(X[2:3]))  # the argument must be a 2-D array, not a single sample
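The zero-probability collapse seen in the hand-worked example above is why MultinomialNB applies additive (Laplace) smoothing by default. A minimal sketch with made-up counts, showing the smoothed per-feature estimates:

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Two classes, three count features; feature 1 never occurs in class 0.
X = np.array([[2, 0, 1],
              [0, 3, 0]])
y = np.array([0, 1])

clf = MultinomialNB(alpha=1.0)  # alpha=1.0 is Laplace smoothing (the default)
clf.fit(X, y)

# Smoothed estimate: (count + alpha) / (total + alpha * n_features)
# Class 0: (2+1)/(3+3), (0+1)/(3+3), (1+1)/(3+3) -> 0.5, 0.167, 0.333
print(np.exp(clf.feature_log_prob_[0]).round(3))
```

Even the feature that never occurred in class 0 gets a non-zero probability, so a single unseen feature can no longer zero out a whole class.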
import numpy as np
from sklearn.naive_bayes import BernoulliNB

X = np.random.randint(2, size=(6, 100))  # binary (0/1) features
Y = np.array([1, 2, 3, 4, 4, 5])

clf = BernoulliNB()
clf.fit(X, Y)
print(clf.predict(X[2:3]))
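Unlike MultinomialNB, BernoulliNB models the binary presence/absence of each feature; real-valued input can be thresholded with its `binarize` parameter. A small sketch with made-up data:

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

# Real-valued features are thresholded at 0.5 before fitting.
X = np.array([[0.9, 0.1],
              [0.2, 0.8]])
y = np.array([0, 1])

clf = BernoulliNB(binarize=0.5).fit(X, y)
# [1.0, 0.0] binarizes to [1, 0], the class-0 training pattern.
print(clf.predict(np.array([[1.0, 0.0]])))  # -> [0]
```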
Others:
[ML] Naive Bayes for email classification
[ML] Naive Bayes for Text Classification
Goto: [Scikit-learn] 1.1 Generalized Linear Models - Comparing various solvers then classifiers