An online policy gradient algorithm for Markov decision processes with continuous states and actions

Yao Ma, Tingting Zhao, Kohei Hatano, Masashi Sugiyama

Research output: Contribution to journal, Letter, peer-review

1 Citation (Scopus)
