TY - JOUR

T1 - Threshold probability of non-terminal type in finite horizon Markov decision processes

AU - Kira, Akifumi

AU - Ueno, Takayuki

AU - Fujita, Toshiharu

N1 - Funding Information:
The authors wish to thank Professor Hidefumi Kawasaki for his valuable advice regarding this investigation. We are also grateful to Professor Seiichi Iwamoto for his constant support; he introduced us to the theory of dynamic programming. This research was supported in part by a Grant-in-Aid for JSPS Fellows. We would like to thank the anonymous reviewer for useful comments and suggestions.

PY - 2012/2/1

Y1 - 2012/2/1

AB - We consider a class of problems concerned with maximizing probabilities, given stage-wise targets, which generalizes the standard threshold probability problem in Markov decision processes. The objective function is the probability that, at all stages, the associatively combined accumulation of rewards earned up to that point takes its value in a specified stage-wise interval. It is shown that this class reduces to the case of the nonnegative-valued multiplicative criterion through an invariant imbedding technique. We derive a recursive formula for the optimal value function and an effective method for obtaining the optimal policies.

UR - http://www.scopus.com/inward/record.url?scp=80052842288&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=80052842288&partnerID=8YFLogxK

U2 - 10.1016/j.jmaa.2011.08.006

DO - 10.1016/j.jmaa.2011.08.006

M3 - Article

AN - SCOPUS:80052842288

VL - 386

SP - 461

EP - 472

JO - Journal of Mathematical Analysis and Applications

JF - Journal of Mathematical Analysis and Applications

SN - 0022-247X

IS - 1

ER -