A study on use of prior information for acceleration of reinforcement learning

Kento Terashima, Junichi Murata

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

Reinforcement learning is a method by which an agent learns appropriate responses for solving problems through trial and error. Its advantage is that it can be applied to unknown or uncertain problems; its drawback is that the trial-and-error process makes learning slow. If prior information about the environment is available, some of the trial and error can be spared and learning can take a shorter time. However, the prior information provided by a human designer can be wrong because of uncertainties in the problem. If wrong prior information is used, it can cause bad effects such as failure to obtain the optimal policy and a slowdown of learning. We propose to control the use of the prior information so as to suppress these bad effects. The agent gradually forgets the prior information by multiplying it by a forgetting factor as it learns a better policy. We apply the proposed method to a couple of testbed environments and several types of prior information. The method shows good results in terms of both learning speed and the quality of the obtained policies.
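
The abstract does not give the method's exact formulation. Purely as an illustration of the idea it describes, the sketch below assumes the prior information enters tabular Q-learning as an additive action-value bias whose weight is multiplied by a forgetting factor after each episode, so a wrong prior fades while correct values are learned by trial and error. All names (`ChainEnv`, `q_learning_with_prior`) and parameter values are hypothetical, not taken from the paper.

```python
import random


class ChainEnv:
    """Toy corridor: start at state 0, reward 1 for reaching the right end."""
    actions = (0, 1)  # 0 = left, 1 = right

    def __init__(self, n=4, max_steps=100):
        self.n, self.max_steps = n, max_steps

    def reset(self):
        self.s, self.t = 0, 0
        return self.s

    def step(self, a):
        self.t += 1
        self.s = min(self.s + 1, self.n) if a == 1 else max(self.s - 1, 0)
        done = self.s == self.n or self.t >= self.max_steps
        reward = 1.0 if self.s == self.n else -0.01  # small step cost
        return self.s, reward, done


def q_learning_with_prior(env, prior, episodes=300, alpha=0.5, gamma=0.95,
                          epsilon=0.1, forgetting=0.9):
    """Tabular Q-learning where a (possibly wrong) prior biases action choice.

    Illustrative sketch only: the prior's weight is multiplied by a
    forgetting factor after each episode, so wrong prior information
    gradually stops influencing the greedy policy.
    """
    q = {}          # learned action values, keyed by (state, action)
    weight = 1.0    # current weight on the prior information

    def value(s, a):
        # Action selection uses the learned value plus the decaying prior bias.
        return q.get((s, a), 0.0) + weight * prior.get((s, a), 0.0)

    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            if random.random() < epsilon:
                a = random.choice(env.actions)
            else:
                a = max(env.actions, key=lambda a: value(s, a))
            s2, r, done = env.step(a)
            best_next = max(q.get((s2, a2), 0.0) for a2 in env.actions)
            td = r + gamma * best_next * (not done) - q.get((s, a), 0.0)
            q[(s, a)] = q.get((s, a), 0.0) + alpha * td
            s = s2
        weight *= forgetting  # gradually forget the prior
    return q


random.seed(0)
env = ChainEnv()
wrong_prior = {(s, 0): 0.5 for s in range(4)}  # wrongly favors moving left
q = q_learning_with_prior(env, wrong_prior)
```

In this toy run the wrong prior slows early episodes, but because its weight decays geometrically, the learned values eventually dominate and the greedy policy at the start state points toward the goal.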

Original language: English
Title of host publication: SICE 2011 - SICE Annual Conference 2011, Final Program and Abstracts
Publisher: Society of Instrument and Control Engineers (SICE)
Pages: 537-543
Number of pages: 7
ISBN (Print): 9784907764395
Publication status: Published - Jan 1 2011
Event: 50th Annual Conference on Society of Instrument and Control Engineers, SICE 2011 - Tokyo, Japan
Duration: Sep 13 2011 - Sep 18 2011

Publication series

Name: Proceedings of the SICE Annual Conference

Other

Other: 50th Annual Conference on Society of Instrument and Control Engineers, SICE 2011
Country: Japan
City: Tokyo
Period: 9/13/11 - 9/18/11

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Computer Science Applications
  • Electrical and Electronic Engineering


Cite this

    Terashima, K., & Murata, J. (2011). A study on use of prior information for acceleration of reinforcement learning. In SICE 2011 - SICE Annual Conference 2011, Final Program and Abstracts (pp. 537-543). [6060724] (Proceedings of the SICE Annual Conference). Society of Instrument and Control Engineers (SICE).