Reinforcement learning by GA using importance sampling

Chikao Tsuchiya, Hajime Kimura, Jun Sakuma, Shigenobu Kobayashi

Research output: Contribution to journal, Article

1 Citation (Scopus)

Abstract

Reinforcement Learning (RL) handles policy search problems: searching for a mapping from state space to action space. However, RL is based on gradient methods and as such cannot deal with problems that have a multimodal landscape. In contrast, although the Genetic Algorithm (GA) is a promising way to deal with such problems, it seems unsuitable for policy search from the viewpoint of evaluation cost. Minimal Generation Gap (MGG), used as a generation-alternation model in GA, generates many offspring from two or more parents selected from a population. Evaluating the policies of the generated offspring therefore requires much trial and error (i.e., interaction between an agent and an environment). In this paper, we incorporate importance sampling into the framework of MGG in order to reduce the cost of evaluation in policy search. The proposed techniques are applied to a Markov Decision Process (MDP) with a multimodal landscape. The experimental results show that these techniques can reduce the number of interactions between an agent and an environment, and also indicate that MGG and importance sampling complement each other.
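
The abstract does not say how importance sampling is combined with MGG, so the following is only a minimal sketch of the general idea, not the authors' method: episodes collected under one policy (for example a parent) are reweighted to estimate the expected return of another policy (for example an offspring) without new interaction with the environment. The tabular softmax policy, the two-state `toy_step` environment, and all function names are assumptions made for illustration.

```python
import math
import random


def softmax_policy(theta, state):
    """Action probabilities in `state` for tabular softmax parameters `theta` (assumed policy form)."""
    prefs = theta[state]
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    z = sum(exps)
    return [e / z for e in exps]


def toy_step(state, action):
    """Hypothetical two-state environment: reward 1 when the action index equals the state index."""
    reward = 1.0 if action == state else 0.0
    return 1 - state, reward


def rollout(theta, step, start_state, horizon, rng):
    """Run one episode under `theta` and return a list of (state, action, reward) tuples."""
    traj, s = [], start_state
    for _ in range(horizon):
        probs = softmax_policy(theta, s)
        a = rng.choices(range(len(probs)), weights=probs)[0]
        s_next, r = step(s, a)
        traj.append((s, a, r))
        s = s_next
    return traj


def is_estimate(theta_new, theta_old, trajectories):
    """Ordinary importance-sampling estimate of the expected return of `theta_new`,
    using episodes that were sampled under `theta_old`."""
    total = 0.0
    for traj in trajectories:
        weight, ret = 1.0, 0.0
        for s, a, r in traj:
            # Likelihood ratio of the action taken, accumulated over the episode.
            weight *= softmax_policy(theta_new, s)[a] / softmax_policy(theta_old, s)[a]
            ret += r
        total += weight * ret
    return total / len(trajectories)
```

For instance, episodes generated once with `rollout(parent, toy_step, 0, 3, random.Random(0))` could be passed to `is_estimate(child, parent, episodes)` for each candidate `child`. The estimate becomes unreliable when the two policies differ strongly, because the product of likelihood ratios then has high variance.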

Original language: English
Pages (from-to): 1-10
Number of pages: 10
Journal: Transactions of the Japanese Society for Artificial Intelligence
Volume: 20
Issue number: 1
Publication status: Published - May 24 2005

All Science Journal Classification (ASJC) codes

  • Software
  • Artificial Intelligence
