Although OpenBUGS works well, what is the principle behind it? We need an intuitive feel for it to grasp its essence.
First, recall [Bayes] prod: M-H: Independence Sampler (the sampling method).
Then recall [ML] How to implement a neural network (the gradient-descent method).
And compare them.
Gradient descent is essentially the process of shrinking the loss function, steadily approaching the fitted solution.
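To make that concrete, here is a minimal gradient-descent sketch in R, assuming a one-parameter model y = a*x with squared-error loss; the data, learning rate, and iteration count are all illustrative choices, not anything from the original posts:

# Fit y = a*x by repeatedly shrinking the mean squared error.
set.seed(1)
x <- runif(100)
y <- 2.5 * x + rnorm(100, sd = 0.1)   # true slope 2.5 (assumed for the demo)
a  <- 0      # initial guess
lr <- 0.1    # learning rate (hand-picked)
for (step in 1:200) {
  grad <- -2 * mean((y - a * x) * x)  # d(loss)/da for squared-error loss
  a <- a - lr * grad                  # step against the gradient
}
a  # should end up close to 2.5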
What about sampling methods?
y = a*x + sigma, where sigma ~ N(0, tau^2)
r <- prod(fy / fx) *          # likelihood ratio under the target distribution
     g(xt[i-1]) / g(y[i])     # times the ratio of proposal densities
if (u[i] <= r) xt[i] <- y[i] else xt[i] <- xt[i-1]
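Fleshing that fragment out, a self-contained independence-sampler sketch for the slope a in y = a*x + sigma might look like the following; the flat prior, the fixed tau, the N(0, 5^2) proposal g, and the chain length are all assumptions made for illustration:

# Independence M-H sampler for the slope a; sigma ~ N(0, tau^2) with tau known.
set.seed(1)
x <- runif(50)
y <- 2.5 * x + rnorm(50, sd = 0.3)
tau <- 0.3
loglik <- function(a) sum(dnorm(y, mean = a * x, sd = tau, log = TRUE))
g <- function(a) dnorm(a, mean = 0, sd = 5)   # fixed proposal density
m  <- 5000
xt <- numeric(m)                              # the chain, started at 0
u  <- runif(m)
for (i in 2:m) {
  cand <- rnorm(1, mean = 0, sd = 5)          # propose from g
  # target likelihood ratio times proposal-density ratio g(xt[i-1]) / g(cand)
  r <- exp(loglik(cand) - loglik(xt[i-1])) * g(xt[i-1]) / g(cand)
  if (u[i] <= r) xt[i] <- cand else xt[i] <- xt[i-1]
}

With a flat prior the posterior is proportional to the likelihood, which is why the ratio above contains no prior term.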
Could this be Bayesian linear regression?
Sampling methods have another advantage: their accuracy can be pushed higher than anything else's. But at the end of the day they are still too slow, too simple~
From: https://zhuanlan.zhihu.com/p/20753438
Figure 01: the rewritten PGM
Conclusion:
Sampling yields a pile of random points from the posterior; from these points we then compute/infer the various statistics of the posterior distribution.
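For example, given the draws xt from a sampler like the one above, the posterior summaries are just empirical statistics of the sample (the burn-in length here is an arbitrary choice):

post <- xt[-(1:1000)]               # drop burn-in draws
mean(post)                          # posterior mean
sd(post)                            # posterior standard deviation
quantile(post, c(0.025, 0.975))     # 95% credible interval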
The terminology "fully Bayesian approach" is nothing but a way to indicate that one moves from a "partially" Bayesian approach to a "true" Bayesian approach, depending on the context.
Or to distinguish a "pseudo-Bayesian" approach from a "strictly" Bayesian approach.
For example one author writes: "Unlike the majority of other authors interested who typically used an Empirical Bayes approach for RVM, we adopt a fully Bayesian approach" because the empirical Bayes approach is a "pseudo-Bayesian" approach.
In essence, empirical Bayes uses historical samples to estimate, directly or indirectly, the prior distribution or some of its numerical characteristics. It is a refinement and extension of the Bayesian method, a form of statistical inference sitting between classical statistics and Bayesian statistics.
There are other pseudo-Bayesian approaches, such as the Bayesian-frequentist predictive distribution (a distribution whose quantiles match the bounds of the frequentist prediction intervals).
On this page several R packages for Bayesian inference are presented. MCMCglmm is presented as a "fully Bayesian approach" because the user has to choose the prior distribution, contrary to the other packages.
Another possible meaning of "fully Bayesian" is when one performs a Bayesian inference derived from the Bayesian decision theory framework, that is, derived from a loss function, because Bayesian decision theory is a solid foundational framework for Bayesian inference.
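As a tiny illustration of that decision-theoretic reading: with posterior draws in hand, the Bayes point estimate is whatever minimizes posterior expected loss, so changing the loss changes the estimate (post is the hypothetical draw vector from earlier):

mean(post)     # squared-error loss  -> posterior mean
median(post)   # absolute-error loss -> posterior median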
I think the terminology is used to distinguish between the Bayesian approach and the empirical Bayes approach.
Full Bayes uses a specified prior whereas empirical Bayes allows the prior to be estimated through use of data.
Fully Bayesian: use a specified prior.
Empirical Bayes: use a prior estimated from the data.
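A minimal sketch of that contrast, assuming the classic normal-normal model theta_i ~ N(mu, s^2), y_i | theta_i ~ N(theta_i, 1); every concrete number here is an illustrative assumption:

# Full Bayes: the prior N(0, 2^2) is specified up front.
set.seed(1)
theta <- rnorm(200, 0, 2)
y <- rnorm(200, mean = theta, sd = 1)
mu0 <- 0; s0 <- 2
theta_fb <- mu0 + (s0^2 / (s0^2 + 1)) * (y - mu0)   # posterior means

# Empirical Bayes: estimate mu and s^2 from the marginal y_i ~ N(mu, s^2 + 1).
mu_hat <- mean(y)
s2_hat <- max(var(y) - 1, 0)
theta_eb <- mu_hat + (s2_hat / (s2_hat + 1)) * (y - mu_hat)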
When there are many unknowns, say n of them:
Consider the nth variable first. Fix the values of the other n-1 variables. How? Each has a pre-assumed distribution, so draw a random point from that distribution.
Note that there are three quantities here, and the relationships among them:
the (x, y) pairs in the sample, and the nth variable.
Given the value of the nth variable (drawn at random from its assumed distribution), compute the likelihood under the observed sample data (x, y).
At the beginning the fit is probably poor, i.e., the likelihood is small. So adjust the parameters of the assumed distribution, for example the mu of a normal, until the likelihood reaches its maximum "under the current situation."
After the adjustment, take one value from this new distribution as the fixed value, then move on to the next variable.
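A minimal Gibbs sketch of this "fix the rest, update one" loop, using a bivariate normal with correlation rho as a stand-in target (rho, the chain length, and the burn-in are illustrative assumptions):

# Gibbs sampling: draw each variable from its full conditional in turn.
set.seed(1)
rho <- 0.8
m <- 5000
th1 <- numeric(m)
th2 <- numeric(m)
for (i in 2:m) {
  # update theta1 with theta2 held fixed at its current value
  th1[i] <- rnorm(1, mean = rho * th2[i-1], sd = sqrt(1 - rho^2))
  # update theta2 with the freshly drawn theta1 held fixed
  th2[i] <- rnorm(1, mean = rho * th1[i], sd = sqrt(1 - rho^2))
}
cor(th1[-(1:500)], th2[-(1:500)])   # should be close to rho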
The "Gibbs" here stands only for an algorithmic idea; it seemingly has little to do with sampling itself.
When the number of variables grows, the process seems to get much more complicated, but the role of the probabilistic graphical model (Figure 01, the rewritten PGM) is to tell us that when estimating a variable D, there is no need to consider all the rest; in the current setting it suffices to consider B and C. This greatly reduces computation time.
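In code this locality simply means the update for D reads only its neighbours B and C, however many other variables the model has; a hypothetical sketch (the names D, B, C and the form of the conditional are assumptions mirroring Figure 01):

# Gibbs update for D: the graph says its full conditional involves only B and C.
update_D <- function(B, C) {
  rnorm(1, mean = (B + C) / 2, sd = 1)   # illustrative full conditional
}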
Here we notice some differences from neural networks:
What if a Bayesian method has many parameters? After all, only one variable can be changed at a time; for an image, even at one variable per pixel, that is already a huge number.
Back propagation in a neural network, by contrast, adjusts a large number of values in one pass.