Reinforcement learning: integrating learning and planning, exploitation and exploration

Date 2020-12-29

Tags UCL exploitati Model

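Contents: Introduction, Model-based RL, Overall framework, Simulation-based search, Exploration and Exploitation

Introduction

The further I read, the more RL strikes me as a way of thinking. The policy and the state have to be defined by you, and although there are formulas for computing the value function, the process is not as direct as in deep learning. The previous chapters covered how to obtain a policy and a value function from experience; this section covers how to learn a model from experience, and then use the model plus experience to ...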
