A Brief Analysis of How AlphaGo Works

Date: 2021-01-21

Tags: reinforcement learning


AlphaGo

Paper: "Mastering the game of Go with deep neural networks and tree search"

Core components:

- Supervised learning policy network (SL policy network)
    13-layer CNN
    Input: the current state
    Output: a probability distribution over all possible actions
    Update strategy: …
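The policy network's output, a probability distribution over all possible actions, can be illustrated with a minimal sketch. The code below does not implement the 13-layer CNN; the `logits` list is a hypothetical stand-in for the network's raw per-action scores, the function name `action_distribution` is invented for illustration, and masking out illegal moves before the softmax is an assumption rather than something stated in the text above.

```python
import math
from typing import List


def action_distribution(logits: List[float], legal: List[bool]) -> List[float]:
    """Turn raw per-action scores into probabilities via a softmax.

    Illegal actions (legal[i] is False) are assigned probability 0.
    This masking step is an assumption made for the sketch.
    """
    if len(logits) != len(legal):
        raise ValueError("logits and legal must have the same length")
    legal_logits = [z for z, ok in zip(logits, legal) if ok]
    if not legal_logits:
        raise ValueError("at least one action must be legal")
    # Subtract the maximum score for numerical stability.
    peak = max(legal_logits)
    exps = [math.exp(z - peak) if ok else 0.0 for z, ok in zip(logits, legal)]
    total = sum(exps)
    return [e / total for e in exps]


# Toy 3x3 board: 9 candidate actions instead of Go's 19x19 = 361.
logits = [0.0] * 9
logits[4] = 2.0          # hypothetical: the network prefers the centre point
legal = [True] * 9
legal[0] = False         # hypothetical: the top-left point is already occupied
probs = action_distribution(logits, legal)
```

Here `probs` sums to 1, the occupied point receives probability 0, and the centre point receives the largest share, which is the shape of output the SL policy network is described as producing for a given state.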
