Paper Notes: Generative Adversarial Nets

Date: 2019-11-09

Tags: paper notes, generative adversarial nets


Generative Adversarial Nets

NIPS 2014

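　　Abstract: The paper proposes a new framework for estimating generative models via an adversarial process, in which two models are trained simultaneously: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than from G. The training objective for G is to maximize the probability of D making a mistake, so that D cannot tell whether a sample was generated or drawn from the training data. This framework corresponds to a minimax two-player game, that is, a zero-sum game: whatever one player gains the other loses, and there is no outcome in which both win. In the space of arbitrary functions G and D, a unique solution exists, with G recovering the training data distribution and D equal to 1/2 everywhere. When G and D are defined by multilayer perceptrons, the entire system can be trained with backpropagation. No Markov chains or unrolled approximate inference networks are needed during either training or sample generation.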

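　　Introduction: The promise of deep learning is to discover rich, hierarchical models that represent probability distributions over the kinds of data encountered in artificial intelligence applications, such as natural images, audio waveforms containing speech, and symbols in natural language corpora. So far, the most striking successes in deep learning have involved discriminative models, usually those that map a high-dimensional, rich sensory input to a class label. Deep generative models have had less impact, because of the difficulty of approximating the many intractable probabilistic computations that arise in maximum likelihood estimation and related strategies, and because of the difficulty of leveraging the benefits of piecewise linear units in the generative context. The paper proposes a new generative model estimation procedure that sidesteps these difficulties.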

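　　In the proposed adversarial nets framework, the generative model is pitted against an adversary: a discriminative model that learns to determine whether a sample comes from the model distribution or the data distribution. The generative model can be thought of as a team of counterfeiters trying to produce fake currency, while the discriminative model is analogous to the police trying to detect the counterfeit currency. Competition in this game drives both teams to improve their methods until the counterfeits are indistinguishable from the genuine articles.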

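　　This framework can yield specific training algorithms for many kinds of models and optimization algorithms. The paper explores the special case in which the generative model produces samples by passing random noise through a multilayer perceptron and the discriminative model is also a multilayer perceptron; this special case is called adversarial nets.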

　　Adversarial nets:

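　　The adversarial modeling framework is most straightforward to apply when the models are both multilayer perceptrons. To learn the generator's distribution $p_g$ over data x, we define a prior $p_z(z)$ on input noise variables, then represent a mapping to data space as $G(z; \theta_g)$, where G is a differentiable function represented by a multilayer perceptron with parameters $\theta_g$. We also define a second multilayer perceptron $D(x; \theta_d)$ that outputs a single scalar. D(x) represents the probability that x came from the data rather than $p_g$. We train D to maximize the probability of assigning the correct label to both training examples and samples from G. We simultaneously train G to minimize $\log(1-D(G(z)))$.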

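　　In other words, D and G play the following two-player minimax game with value function V(G, D):

　　$$\min_G \max_D V(G, D) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))] \quad (1)$$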
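　　To make the roles of G, D and the value function concrete, here is a minimal NumPy sketch. It is only an illustration under stated assumptions: the layer sizes, the tanh hidden units, the toy data distribution and the uniform noise prior are all hypothetical choices, not the paper's architecture. It builds two small multilayer perceptrons and forms a Monte Carlo estimate of V(G, D) as the mean of log D(x) over data samples plus the mean of log(1 - D(G(z))) over noise samples.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes, rng):
    """Return a list of (W, b) pairs for a fully connected network."""
    return [(rng.normal(0.0, 0.5, size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, h):
    """Forward pass: tanh hidden layers, linear output layer."""
    for W, b in params[:-1]:
        h = np.tanh(h @ W + b)
    W, b = params[-1]
    return h @ W + b

def G(theta_g, z):
    """Generator: maps noise z to data space."""
    return mlp(theta_g, z)

def D(theta_d, x):
    """Discriminator: a single scalar in (0, 1) per sample."""
    return 1.0 / (1.0 + np.exp(-mlp(theta_d, x)))

def value_estimate(theta_g, theta_d, x_data, z):
    """Monte Carlo estimate of V(G, D) from minibatches of data and noise."""
    real_term = np.mean(np.log(D(theta_d, x_data)))
    fake_term = np.mean(np.log(1.0 - D(theta_d, G(theta_g, z))))
    return real_term + fake_term

theta_g = init_mlp([1, 8, 1], rng)             # hypothetical layer sizes
theta_d = init_mlp([1, 8, 1], rng)
x_data = rng.normal(4.0, 1.0, size=(256, 1))   # toy p_data (assumption)
z = rng.uniform(-1.0, 1.0, size=(256, 1))      # toy prior p_z (assumption)
v = value_estimate(theta_g, theta_d, x_data, z)
```

　　Both terms are logarithms of numbers in (0, 1), so the estimate is always negative. D tries to push it up towards 0, while G tries to push the second term down.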

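　　The next section presents a theoretical analysis of adversarial nets, essentially showing that the training criterion allows one to recover the data generating distribution as G and D are given enough capacity, i.e. in the non-parametric limit. Figure 1 gives a less formal, more intuitive explanation of the approach. In practice, the game must be implemented with an iterative, numerical approach. Optimizing D to completion in the inner loop of training is computationally prohibitive, and on finite datasets it would result in overfitting. Instead, training alternates between k steps of optimizing D and one step of optimizing G. This keeps D near its optimal solution, as long as G changes slowly enough. The strategy is analogous to SML/PCD training, and the procedure is summarized in Algorithm 1.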
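　　The alternating schedule (k discriminator steps, then one generator step) can be sketched on a one-dimensional toy problem. Everything specific here is an assumption made for illustration: the data are drawn from N(4, 1), the generator is the shift G(z) = z + c with z ~ N(0, 1), the discriminator is the logistic model D(x) = sigmoid(w x + b), and the gradients are written out by hand. It is not the paper's Algorithm 1 verbatim, and this toy game is not guaranteed to converge; the point is only the order of the updates.

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def train(num_iters=200, k=1, batch=64, lr=0.05, seed=0):
    rng = np.random.default_rng(seed)
    w, b = 0.0, 0.0   # discriminator parameters (theta_d)
    c = 0.0           # generator parameter (theta_g)
    c_history = []
    for _ in range(num_iters):
        # k steps of stochastic gradient ASCENT on the discriminator
        for _ in range(k):
            x = rng.normal(4.0, 1.0, size=batch)       # minibatch from toy p_data
            g = rng.normal(0.0, 1.0, size=batch) + c   # minibatch of G(z), z ~ p_z
            d_x, d_g = sigmoid(w * x + b), sigmoid(w * g + b)
            # Gradient of mean[log D(x)] + mean[log(1 - D(G(z)))] w.r.t. (w, b)
            grad_w = np.mean((1.0 - d_x) * x) - np.mean(d_g * g)
            grad_b = np.mean(1.0 - d_x) - np.mean(d_g)
            w += lr * grad_w
            b += lr * grad_b
        # one step of stochastic gradient DESCENT on the generator
        g = rng.normal(0.0, 1.0, size=batch) + c
        d_g = sigmoid(w * g + b)
        # Gradient of mean[log(1 - D(G(z)))] w.r.t. c is mean(-D(G(z)) * w)
        grad_c = np.mean(-d_g * w)
        c -= lr * grad_c
        c_history.append(c)
    return w, b, c_history

w, b, c_history = train()
```

　　Raising k keeps the discriminator closer to its optimum for the current generator, at the cost of more computation per generator update.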

　　In practice, Equation 1 may not provide sufficient gradient for G to learn well. Early in learning, when G is poor, D can reject samples with high confidence because they are clearly different from the training data. In this case, $\log(1-D(G(z)))$ saturates. Rather than training G to minimize $\log(1-D(G(z)))$, we can train G to maximize $\log D(G(z))$. This objective results in the same fixed point of the dynamics of G and D but provides much stronger gradients early in learning.
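　　The saturation effect is easy to see numerically. The sketch below assumes that the discriminator output is a sigmoid of some logit s, i.e. D(G(z)) = sigmoid(s), which is a common but here hypothetical parameterization. It compares the gradient with respect to s of the two generator objectives when D confidently rejects a sample.

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def grad_saturating(s):
    """d/ds of log(1 - sigmoid(s)): the term G minimizes in the minimax game."""
    return -sigmoid(s)

def grad_non_saturating(s):
    """d/ds of log(sigmoid(s)): the alternative objective G maximizes."""
    return 1.0 - sigmoid(s)

s = -5.0  # logit of a generated sample that D rejects with high confidence
print(abs(grad_saturating(s)), abs(grad_non_saturating(s)))
```

　　At s = -5 the first gradient has magnitude of roughly 0.0067, while the second is roughly 0.9933, so the alternative objective passes a far larger learning signal back to G while D is still winning easily.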

　　Figure 1. The four panels illustrate the adversarial training process. The lines and markers are:

　　------ the discriminative distribution (D, blue, dashed line);

　　------ the data generating distribution $p_x$ (black, dotted line);

　　------ the generative distribution $p_g$ (G, green, solid line);

　　------ the lower horizontal line is the domain from which z is sampled;

　　------ the horizontal line above is part of the domain of x;

　　------ the upward arrows show how the mapping x = G(z) imposes the non-uniform distribution $p_g$ on transformed samples.

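　　(a) Consider an adversarial pair near convergence: $p_g$ is similar to $p_{data}$, and D is a partially accurate classifier.

　　(b) In the inner loop of the algorithm, D is trained to discriminate samples from data, converging to $D^*(x) = \frac{p_{data}(x)}{p_{data}(x) + p_g(x)}$.

　　(c) After an update to G, the gradient of D has guided G(z) to flow to regions that are more likely to be classified as data.

　　(d) After several steps of training, if G and D have enough capacity, they reach a point at which neither can improve, because $p_g = p_{data}$. The discriminator can no longer tell the two distributions apart, i.e. D(x) = 1/2.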
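　　Panels (b) and (d) can be reproduced numerically by evaluating $D^*(x)$ on a grid. The sketch below assumes unit-variance Gaussian densities for both $p_{data}$ and $p_g$; the means and the grid are hypothetical values picked only for illustration.

```python
import numpy as np

def gaussian_pdf(x, mean, std):
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def optimal_discriminator(p_data, p_g):
    """D*(x) = p_data(x) / (p_data(x) + p_g(x)), evaluated pointwise."""
    return p_data / (p_data + p_g)

x = np.linspace(-2.0, 6.0, 9)
p_data = gaussian_pdf(x, 2.0, 1.0)

# Panel (b): the generator is off target, so D* varies with x.
d_mismatch = optimal_discriminator(p_data, gaussian_pdf(x, 1.0, 1.0))

# Panel (d): the generator matches the data, so D* is 1/2 everywhere.
d_matched = optimal_discriminator(p_data, gaussian_pdf(x, 2.0, 1.0))
```

　　With the generator shifted to the left of the data, `d_mismatch` increases from left to right: points further to the right are more likely to be real. Once the two densities coincide, `d_matched` is flat at 1/2.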

　　Theoretical Results

　　The authors show that the minimax game has a global optimum at $p_g = p_{data}$.

　　Global Optimality of $p_g = p_{data}$:

　　For any given generator G, we first consider the optimal discriminator D.

　　Proposition 1. For G fixed, the optimal discriminator D is:

　　$$D^*_G(x) = \frac{p_{data}(x)}{p_{data}(x) + p_g(x)}$$

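　　Proof. The training criterion for the discriminator D, given any generator G, is to maximize the quantity V(G, D):

　　$$V(G, D) = \int_x p_{data}(x) \log D(x) \, dx + \int_z p_z(z) \log(1 - D(G(z))) \, dz = \int_x \left[ p_{data}(x) \log D(x) + p_g(x) \log(1 - D(x)) \right] dx$$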

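　　For any $(a, b) \in \mathbb{R}^2 \setminus \{0, 0\}$, the function $y \mapsto a \log(y) + b \log(1-y)$ achieves its maximum in [0, 1] at $\frac{a}{a+b}$. The discriminator does not need to be defined outside of $Supp(p_{data}) \cup Supp(p_g)$, which concludes the proof.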
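　　The claim about the maximizer can be checked with elementary calculus. A short derivation, assuming a > 0 and b > 0 so that both logarithms are finite on (0, 1):

```latex
\begin{align*}
f(y) &= a \log y + b \log(1 - y), \qquad y \in (0, 1) \\
f'(y) &= \frac{a}{y} - \frac{b}{1 - y} = 0
  \;\Longrightarrow\; a(1 - y) = b y
  \;\Longrightarrow\; y = \frac{a}{a + b} \\
f''(y) &= -\frac{a}{y^2} - \frac{b}{(1 - y)^2} < 0
\end{align*}
```

　　Since f is strictly concave, the stationary point is the unique maximum. Taking a = $p_{data}(x)$ and b = $p_g(x)$ pointwise gives the optimal discriminator of Proposition 1.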

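　　The training objective for D can be interpreted as maximizing the log-likelihood for estimating the conditional probability $P(Y = y|x)$, where Y indicates whether x comes from $p_{data}$ (with y = 1) or from $p_g$ (with y = 0). The minimax game in Equation 1 can now be reformulated as:

　　$$C(G) = \max_D V(G, D) = \mathbb{E}_{x \sim p_{data}}\left[\log \frac{p_{data}(x)}{p_{data}(x) + p_g(x)}\right] + \mathbb{E}_{x \sim p_g}\left[\log \frac{p_g(x)}{p_{data}(x) + p_g(x)}\right]$$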
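　　As a numerical sanity check on the global optimum, write C(G) for the value of V(G, D) at the optimal discriminator, and evaluate it on a small discrete support. The two distributions below are made-up examples. The sketch checks two facts: C(G) equals -log 4 when $p_g = p_{data}$, and in general C(G) = -log 4 + 2 JSD($p_{data}$ || $p_g$), which is the identity the paper goes on to prove.

```python
import numpy as np

def criterion(p_data, p_g):
    """C(G) = E_{p_data}[log D*(x)] + E_{p_g}[log(1 - D*(x))] on a finite support."""
    d_star = p_data / (p_data + p_g)
    return np.sum(p_data * np.log(d_star)) + np.sum(p_g * np.log(1.0 - d_star))

def jsd(p, q):
    """Jensen-Shannon divergence between two strictly positive discrete distributions."""
    m = 0.5 * (p + q)
    return 0.5 * np.sum(p * np.log(p / m)) + 0.5 * np.sum(q * np.log(q / m))

p_data = np.array([0.1, 0.2, 0.3, 0.4])   # hypothetical data distribution
p_g = np.array([0.4, 0.3, 0.2, 0.1])      # hypothetical generator distribution

c_equal = criterion(p_data, p_data)   # generator matches the data
c_other = criterion(p_data, p_g)      # generator does not match the data
```

　　Because the Jensen-Shannon divergence is non-negative and zero only when the two distributions coincide, `c_other` is strictly larger than `c_equal`, which is why $p_g = p_{data}$ is the minimum of C(G).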

　　Experiments :
