An extension of the rational policy making algorithm to continuous state spaces

Kazuteru Miyazaki, Hajime Kimura, Shigenobu Kobayashi

Research output: Contribution to journal › Article

4 Citations (Scopus)

Abstract

Reinforcement learning is a branch of machine learning. Profit Sharing, the Rational Policy Making algorithm (RPM), the Penalty Avoiding Rational Policy Making algorithm, and PS-r* are known to guarantee rationality in a typical class of Partially Observable Markov Decision Processes. However, they cannot handle continuous state spaces. In this paper, we present a way to adapt them to continuous state spaces. We extend RPM with a mechanism for handling continuous state spaces in environments that provide a single type of reward. We demonstrate the effectiveness of the proposed method through numerical examples.
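The abstract describes, at a high level, a mechanism that lets RPM operate over continuous states. As a rough illustration only, the sketch below shows one common way a Profit-Sharing-style learner can be applied to a continuous state space: bin the state before rule lookup, then share the terminal reward backwards along the episode with geometric decay. This is not the mechanism proposed in the paper; the bin layout, decay rate, action set, and the env.reset()/env.step() interface are all illustrative assumptions.

    # Hypothetical sketch: Profit-Sharing-style credit assignment over a
    # binned continuous state space. Not the authors' mechanism; the
    # discretization, decay rate, and environment interface are assumed.
    import numpy as np
    from collections import defaultdict

    N_BINS = 10      # bins per state dimension (assumed)
    N_ACTIONS = 2    # number of discrete actions (assumed)
    DECAY = 0.8      # geometric decay of the reinforcement function

    # Rule weights, keyed by (binned state, action).
    weights = defaultdict(float)

    def discretize(state, low, high, n_bins=N_BINS):
        """Map a continuous state vector to a tuple of bin indices."""
        low, high = np.asarray(low, dtype=float), np.asarray(high, dtype=float)
        ratios = (np.asarray(state, dtype=float) - low) / (high - low)
        bins = np.clip((ratios * n_bins).astype(int), 0, n_bins - 1)
        return tuple(int(b) for b in bins)

    def run_episode(env, low, high, epsilon=0.1):
        """Collect one episode, then distribute the terminal reward
        backwards along the visited (state, action) rules with
        geometric decay (Profit-Sharing-style credit assignment)."""
        episode = []
        state, done, reward = env.reset(), False, 0.0
        while not done:
            s = discretize(state, low, high)
            if np.random.rand() < epsilon:
                a = np.random.randint(N_ACTIONS)
            else:
                a = int(np.argmax([weights[(s, i)] for i in range(N_ACTIONS)]))
            state, reward, done = env.step(a)  # assumed interface
            episode.append((s, a))
        credit = reward
        for s, a in reversed(episode):
            weights[(s, a)] += credit
            credit *= DECAY

A greedy policy can then be read off the weights by choosing, in each binned state, the action with the largest accumulated credit; again, this is only a sketch of the general Profit-Sharing idea, not the paper's RPM extension.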

Original language: English
Pages (from-to): 332-341
Number of pages: 10
Journal: Transactions of the Japanese Society for Artificial Intelligence
Volume: 22
Issue number: 3
DOIs: 10.1527/tjsai.22.332
Publication status: Published - Jan 1 2007

All Science Journal Classification (ASJC) codes

  • Software
  • Artificial Intelligence

Cite this

An extension of the rational policy making algorithm to continuous state spaces. / Miyazaki, Kazuteru; Kimura, Hajime; Kobayashi, Shigenobu.

In: Transactions of the Japanese Society for Artificial Intelligence, Vol. 22, No. 3, 01.01.2007, p. 332-341.

Research output: Contribution to journal › Article

@article{b137c39c4b7249f7bf92941b8d48958f,
title = "An extension of the rational policy making algorithm to continuous state spaces",
abstract = "Reinforcement Learning is a kind of machine learning. We know Profit Sharing, the Rational Policy Making algorithm (RPM), the Penalty Avoiding Rational Policy Making algorithm and PS-r* to guarantee the rationality in a typical class of the Partially Observable Markov Decision Processes. However they cannot treat continuous state spaces. In this paper, we present a solution to adapt them in continuous state spaces. We give RPM a mechanism to treat continuous state spaces in the environment that has the same type of a reward. We show the effectiveness of the proposed method in numerical examples.",
author = "Kazuteru Miyazaki and Hajime Kimura and Shigenobu Kobayashi",
year = "2007",
month = "1",
day = "1",
doi = "10.1527/tjsai.22.332",
language = "English",
volume = "22",
pages = "332--341",
journal = "Transactions of the Japanese Society for Artificial Intelligence",
issn = "1346-0714",
publisher = "Japanese Society for Artificial Intelligence",
number = "3",

}

TY - JOUR

T1 - An extension of the rational policy making algorithm to continuous state spaces

AU - Miyazaki, Kazuteru

AU - Kimura, Hajime

AU - Kobayashi, Shigenobu

PY - 2007/1/1

Y1 - 2007/1/1

N2 - Reinforcement Learning is a kind of machine learning. We know Profit Sharing, the Rational Policy Making algorithm (RPM), the Penalty Avoiding Rational Policy Making algorithm and PS-r* to guarantee the rationality in a typical class of the Partially Observable Markov Decision Processes. However they cannot treat continuous state spaces. In this paper, we present a solution to adapt them in continuous state spaces. We give RPM a mechanism to treat continuous state spaces in the environment that has the same type of a reward. We show the effectiveness of the proposed method in numerical examples.

AB - Reinforcement Learning is a kind of machine learning. We know Profit Sharing, the Rational Policy Making algorithm (RPM), the Penalty Avoiding Rational Policy Making algorithm and PS-r* to guarantee the rationality in a typical class of the Partially Observable Markov Decision Processes. However they cannot treat continuous state spaces. In this paper, we present a solution to adapt them in continuous state spaces. We give RPM a mechanism to treat continuous state spaces in the environment that has the same type of a reward. We show the effectiveness of the proposed method in numerical examples.

UR - http://www.scopus.com/inward/record.url?scp=34247535460&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=34247535460&partnerID=8YFLogxK

U2 - 10.1527/tjsai.22.332

DO - 10.1527/tjsai.22.332

M3 - Article

AN - SCOPUS:34247535460

VL - 22

SP - 332

EP - 341

JO - Transactions of the Japanese Society for Artificial Intelligence

JF - Transactions of the Japanese Society for Artificial Intelligence

SN - 1346-0714

IS - 3

ER -