AI - Reinforcement

MDP (Markov Decision Process)

An MDP is described by four components: State Space, Action Space, Transition Function, Reward Function.

State: S
Action: A
Transition Function: T(s,a,s′)=P(St+1 …
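The four components can be made concrete with a small tabular sketch. This is only an illustration under stated assumptions: the transition formula in the text is cut off, so the code assumes the common textbook reading, in which T(s, a, s′) is the probability of landing in s′ at time t+1 given state s and action a at time t. The weather states, the actions, the probabilities, and the reward rule are all hypothetical and do not come from the original article.

```python
import random

# Hypothetical state space S and action space A (not from the article).
STATES = ["sunny", "rainy"]
ACTIONS = ["walk", "stay"]

# Assumed reading of the transition function:
# T[(s, a)][s_next] = P(S_{t+1} = s_next | S_t = s, A_t = a).
# Each inner dict is a probability distribution over next states.
T = {
    ("sunny", "walk"): {"sunny": 0.8, "rainy": 0.2},
    ("sunny", "stay"): {"sunny": 0.9, "rainy": 0.1},
    ("rainy", "walk"): {"sunny": 0.3, "rainy": 0.7},
    ("rainy", "stay"): {"sunny": 0.5, "rainy": 0.5},
}


def transition_prob(s, a, s_next):
    """Return T(s, a, s_next); unlisted next states have probability 0."""
    return T[(s, a)].get(s_next, 0.0)


def reward(s, a, s_next):
    """Hypothetical reward function: 1 for reaching 'sunny', else 0."""
    return 1.0 if s_next == "sunny" else 0.0


def step(s, a, rng=random):
    """Sample a next state from T(s, a, .) and return (s_next, reward)."""
    next_states = list(T[(s, a)])
    weights = [T[(s, a)][n] for n in next_states]
    s_next = rng.choices(next_states, weights=weights, k=1)[0]
    return s_next, reward(s, a, s_next)
```

Calling `step("sunny", "walk")` samples one transition; passing a seeded `random.Random` instance as `rng` makes the sampling reproducible. A dictionary keyed by (state, action) is a convenient choice for small discrete MDPs, although larger problems would more likely use arrays.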