Reinforcement learning paper: "Policy invariance under reward transformations: Theory and application to reward shaping"

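"Policy invariance under reward transformations: Theory and application to reward shaping" is an important theoretical foundation for reward shaping and offers guidance for designing reward functions. Its authors include Andrew Ng. The paper is available at http://luthuli.cs.uiuc.edu/~daf/courses/games/AIpapers/ng99policy.pdf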
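
To make the design guidance concrete, here is a minimal sketch of the paper's best-known result: adding a shaping term of the form F(s, a, s') = gamma * phi(s') - phi(s) to the reward leaves the optimal policy unchanged. Everything below is an illustrative assumption rather than code from the paper: the small random MDP, the potential values in `phi`, and the helper names `value_iteration`, `shape_rewards`, and `greedy` are all hypothetical.

```python
import random


def value_iteration(P, R, gamma, iters=500):
    """Return Q[s][a] for transition probs P[s][a][t] and rewards R[s][a][t]."""
    n_s, n_a = len(P), len(P[0])
    Q = [[0.0] * n_a for _ in range(n_s)]
    for _ in range(iters):
        V = [max(row) for row in Q]  # greedy state values from the current Q
        Q = [[sum(P[s][a][t] * (R[s][a][t] + gamma * V[t]) for t in range(n_s))
              for a in range(n_a)]
             for s in range(n_s)]
    return Q


def shape_rewards(R, phi, gamma):
    """Add the potential-based term gamma * phi[t] - phi[s] to every reward."""
    n_s, n_a = len(R), len(R[0])
    return [[[R[s][a][t] + gamma * phi[t] - phi[s] for t in range(n_s)]
             for a in range(n_a)]
            for s in range(n_s)]


def greedy(Q):
    """Index of the best action in each state."""
    return [max(range(len(row)), key=row.__getitem__) for row in Q]


# Hypothetical toy MDP: random transitions, rewards, and potentials.
random.seed(0)
n_states, n_actions, gamma = 5, 3, 0.9

P = []
for s in range(n_states):
    P.append([])
    for a in range(n_actions):
        weights = [random.random() for _ in range(n_states)]
        total = sum(weights)
        P[s].append([w / total for w in weights])  # normalize to a distribution

R = [[[random.gauss(0, 1) for _ in range(n_states)]
      for _ in range(n_actions)]
     for _ in range(n_states)]
phi = [random.gauss(0, 1) for _ in range(n_states)]  # arbitrary potential

Q = value_iteration(P, R, gamma)
Q_shaped = value_iteration(P, shape_rewards(R, phi, gamma), gamma)

# The greedy policies should coincide even though the rewards differ.
policy_unchanged = greedy(Q) == greedy(Q_shaped)
print(policy_unchanged)
```

In this sketch the shaped action values differ from the original ones only by the state-dependent offset phi[s], which is why the best action in each state stays the same. An arbitrary bonus that is not of this potential-based form carries no such guarantee.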