A study on abstract policy for acceleration of reinforcement learning

Ahmad Afif Mohd Faudzi, Hirotaka Takano, Junichi Murata

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

Reinforcement learning (RL) is well known as a method that can be applied to unknown problems. However, because optimization at every state requires trial and error, the learning time becomes long when the environment has many states. If solutions to similar problems exist and are used during exploration, some of the trial and error can be spared and learning can take a shorter time. In this paper, the authors propose to reuse an abstract policy, a representative of a solution constructed by the learning vector quantization (LVQ) algorithm, to improve the initial performance of an RL learner in a similar but different problem. Furthermore, it is investigated whether the policy can adapt to a new environment while preserving its performance in the old environments. Simulations show good results in terms of learning acceleration and adaptation of the abstract policy.
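The mechanism the abstract describes (distilling a learned policy into LVQ prototypes over state features, then letting those prototypes steer exploration in a related task) can be pictured with a minimal sketch. The grid tasks, the LVQ1 update rule, the hand-made source policy, and all names and parameter values below are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

class Grid:
    """Tiny grid world (illustrative stand-in for the paper's tasks).
    Actions: 0=up, 1=down, 2=left, 3=right. State features are the
    normalized offsets from the agent to the goal, so they transfer
    across grids of different sizes."""
    def __init__(self, size, goal):
        self.size, self.goal = size, goal
        self.n_states, self.n_actions = size * size, 4

    def reset(self):
        self.pos = (0, 0)
        return 0

    def features(self, s):
        r, c = divmod(s, self.size)
        return np.array([(self.goal[0] - r) / self.size,
                         (self.goal[1] - c) / self.size])

    def step(self, a):
        dr, dc = [(-1, 0), (1, 0), (0, -1), (0, 1)][a]
        r = min(max(self.pos[0] + dr, 0), self.size - 1)
        c = min(max(self.pos[1] + dc, 0), self.size - 1)
        self.pos = (r, c)
        done = self.pos == self.goal
        return r * self.size + c, (1.0 if done else -0.01), done

def train_lvq(features, actions, n_protos=2, lr=0.1, epochs=30):
    """LVQ1: pull the nearest prototype toward samples of its own class,
    push it away from samples of other classes."""
    protos, labels = [], []
    for a in np.unique(actions):
        idx = rng.choice(np.flatnonzero(actions == a), n_protos)
        protos.append(features[idx])
        labels.append(np.full(n_protos, a))
    protos = np.vstack(protos).astype(float)
    labels = np.concatenate(labels)
    for _ in range(epochs):
        for x, a in zip(features, actions):
            k = np.argmin(np.linalg.norm(protos - x, axis=1))
            sign = 1.0 if labels[k] == a else -1.0
            protos[k] += sign * lr * (x - protos[k])
    return protos, labels

def abstract_policy(protos, labels, x):
    """The transferred policy: act as the nearest prototype's label says."""
    return int(labels[np.argmin(np.linalg.norm(protos - x, axis=1))])

def q_learning(env, protos, labels, episodes=200, alpha=0.5, gamma=0.95, eps=0.2):
    """Q-learning whose exploratory moves follow the abstract policy
    rather than being uniformly random (one plausible reuse scheme)."""
    Q = np.zeros((env.n_states, env.n_actions))
    for _ in range(episodes):
        s, done = env.reset(), False
        for _ in range(400):                    # step cap keeps episodes finite
            if rng.random() < eps:
                a = abstract_policy(protos, labels, env.features(s))
            else:
                a = int(np.argmax(Q[s]))
            s2, r, done = env.step(a)
            Q[s, a] += alpha * (r + gamma * np.max(Q[s2]) * (not done) - Q[s, a])
            s = s2
            if done:
                break
    return Q

# Source task: a 4x4 grid whose (hand-made, stand-in) solved policy heads
# for the goal; its (feature, action) pairs train the abstract policy.
src = Grid(size=4, goal=(3, 3))
feats = np.array([src.features(s) for s in range(src.n_states)])
acts = np.array([1 if f[0] >= f[1] else 3 for f in feats])  # down or right
protos, labels = train_lvq(feats, acts)

# Target task: a larger 6x6 grid; the abstract policy guides exploration.
Q = q_learning(Grid(size=6, goal=(5, 5)), protos, labels)
print("greedy action at start state:", int(np.argmax(Q[0])))
```

In this reading, the abstract policy is queried only on exploratory steps, so a poor transfer degrades gracefully (the Q-learner can still overwrite bad suggestions) while a good transfer yields the improved initial performance the abstract reports.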

Original language: English
Title of host publication: Proceedings of the SICE Annual Conference
Publisher: Society of Instrument and Control Engineers (SICE)
Pages: 1793-1798
Number of pages: 6
ISBN (Electronic): 9784907764463
DOIs: 10.1109/SICE.2014.6935300
Publication status: Published - Oct 23 2014
Event: 2014 53rd Annual Conference of the Society of Instrument and Control Engineers of Japan, SICE 2014 - Sapporo, Japan
Duration: Sep 9 2014 - Sep 12 2014

Publication series

Name: Proceedings of the SICE Annual Conference

Other

Other: 2014 53rd Annual Conference of the Society of Instrument and Control Engineers of Japan, SICE 2014
Country: Japan
City: Sapporo
Period: 9/9/14 - 9/12/14

Fingerprint

Reinforcement learning
Vector quantization

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Computer Science Applications
  • Electrical and Electronic Engineering

Cite this

Faudzi, A. A. M., Takano, H., & Murata, J. (2014). A study on abstract policy for acceleration of reinforcement learning. In Proceedings of the SICE Annual Conference (pp. 1793-1798). [6935300] (Proceedings of the SICE Annual Conference). Society of Instrument and Control Engineers (SICE). https://doi.org/10.1109/SICE.2014.6935300

