TY - JOUR
T1 - Taming decentralized POMDPs
T2 - 18th International Joint Conference on Artificial Intelligence, IJCAI 2003
AU - Nair, R.
AU - Tambe, M.
AU - Yokoo, M.
AU - Pynadath, D.
AU - Marsella, S.
PY - 2003/12/1
Y1 - 2003/12/1
N2 - The problem of deriving joint policies for a group of agents that maximize some joint reward function can be modeled as a decentralized partially observable Markov decision process (POMDP). Yet, despite the growing importance and applications of decentralized POMDP models in the multiagents arena, few algorithms have been developed for efficiently deriving joint policies for these models. This paper presents a new class of locally optimal algorithms called "Joint Equilibrium-based search for policies (JESP)". We first describe an exhaustive version of JESP and subsequently a novel dynamic programming approach to JESP. Our complexity analysis reveals the potential for exponential speedups due to the dynamic programming approach. These theoretical results are verified via empirical comparisons of the two JESP versions with each other and with a globally optimal brute-force search algorithm. Finally, we prove piece-wise linear and convexity (PWLC) properties, thus taking steps towards developing algorithms for continuous belief states.
UR - http://www.scopus.com/inward/record.url?scp=84880823326&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84880823326&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:84880823326
SP - 705
EP - 711
JO - IJCAI International Joint Conference on Artificial Intelligence
JF - IJCAI International Joint Conference on Artificial Intelligence
SN - 1045-0823
Y2 - 9 August 2003 through 15 August 2003
ER -