cs294-RL introduction

Date: 2021-01-16

Tags: cs294, reinforcement learning


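Types of reinforcement learning algorithms
• model-based RL
• value functions
• policy gradient
• actor-critic: value function plus policy gradients

Why are there so many RL algorithms?
• Trade-offs: sample efficiency, stability
• Different assumptions: stochastic or deterministic, continuous or discrete, episodic or infinite horizon
• Different difficulty: whether the policy or the model is easier to represent

Sample efficiency, on-poli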
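To make "actor-critic: value function plus policy gradients" concrete, here is a minimal tabular sketch in Python. Everything in it is an assumption made for illustration and does not come from the course notes: the four-state chain environment, the names `step`, `softmax`, `train`, `theta`, and `value`, and all hyperparameters. The critic learns a state-value table from one-step TD errors, and the actor uses the same TD error as an advantage estimate in a softmax policy-gradient step. The gamma^t weighting on the actor update is left out for simplicity.

```python
import math
import random

N_STATES = 4        # hypothetical toy chain: states 0..3, state 3 is terminal
LEFT, RIGHT = 0, 1
GAMMA = 0.9         # discount factor (assumed value)


def step(state, action):
    """Deterministic chain: RIGHT moves toward the goal, LEFT moves away (floor at 0)."""
    next_state = state + 1 if action == RIGHT else max(state - 1, 0)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train(episodes=3000, alpha_actor=0.5, alpha_critic=0.2, max_steps=100, seed=0):
    rng = random.Random(seed)
    theta = [[0.0, 0.0] for _ in range(N_STATES)]  # actor: action preferences per state
    value = [0.0] * N_STATES                       # critic: state-value estimates
    for _episode in range(episodes):
        state = 0
        for _t in range(max_steps):
            probs = softmax(theta[state])
            action = RIGHT if rng.random() < probs[RIGHT] else LEFT
            next_state, reward, done = step(state, action)

            # Critic: one-step TD error; the terminal state has value 0.
            target = reward + (0.0 if done else GAMMA * value[next_state])
            td_error = target - value[state]
            value[state] += alpha_critic * td_error

            # Actor: policy-gradient step using the TD error as the advantage.
            # For a softmax policy, d log pi(a|s) / d theta[s][b] = 1[a == b] - pi(b|s).
            for b in (LEFT, RIGHT):
                grad_log_pi = (1.0 if b == action else 0.0) - probs[b]
                theta[state][b] += alpha_actor * td_error * grad_log_pi

            if done:
                break
            state = next_state
    return theta, value


if __name__ == "__main__":
    theta, value = train()
    print("P(right | state 0):", softmax(theta[0])[RIGHT])
    print("critic values:", value)
```

Dropping the critic and weighting the gradient by the full episode return would give a plain policy-gradient method. Keeping only the value table and acting greedily would give a value-function method. The sketch sits between the two, which is the sense in which actor-critic combines them.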
