An online policy gradient algorithm for Markov decision processes with continuous states and actions
Yao Ma, Tingting Zhao, Kohei Hatano, Masashi Sugiyama
Research output: Contribution to journal › Letter › peer-review
1 citation (Scopus)