Research Output 1991-2020

2016
1 Citation (Scopus)

An online policy gradient algorithm for Markov decision processes with continuous states and actions

Ma, Y., Zhao, T., Hatano, K. & Sugiyama, M., Mar 1 2016, In: Neural Computation. 28, 3, p. 563-593 (31 p.)

Research output: Contribution to journal › Letter

Markov Chains
Learning
Emotions
Reward
Decision Making
2011

When Fukushima became the electricity supplier for the Kanto region (福島が関東への電力供給地になった時)

宮地英敏, Nov 2011, In: 書斎の窓. 609, p. 37-41 (5 p.)

Research output: Contribution to journal › Letter