David Silver's "Reinforcement Learning" Course Notes: Lecture 5, Model-Free Control

Date: 2021-01-11

The previous lecture covered solving an MDP with an unknown environment when the policy is given, which is called the Model-Free Prediction problem. This lecture addresses the MDP with an unknown environment when the policy is not given either, that is, the Model-Free Control problem.
