CS231N-14-Reinforcement Learning

Topics: What is Reinforcement Learning, Markov Decision Process (MDP), Value Function, Q-value Function, Bellman Equation, Q-learning, Policy Gradient.

This is the final lecture. So far, we have mainly talked about supervised learning lik