Novelty-organizing team of classifiers - A team-individual multi-objective approach to reinforcement learning

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

In reinforcement learning, there are essentially two spaces to search: value-function space and policy space. Consequently, there are two fitness functions, each with its associated trade-offs. However, the problem is still perceived as a single-objective one. Here, a multi-objective reinforcement learning algorithm is proposed, with a structured novelty map population evolving feedforward neural models. It outperforms a gradient-based, continuous input-output, state-of-the-art algorithm on two problems. Unlike the gradient-based algorithm, the proposed one solves both problems with the same parameters and with smaller variance in the results. Moreover, the results are comparable even with discrete-action algorithms from the literature, as well as neuroevolution methods such as NEAT. The proposed method also introduces the novelty map population concept, i.e., a novelty map-based population that is less sensitive to the input distribution and therefore more suitable for creating the state space. In fact, the novelty map framework is shown to be less dynamic and more resource-efficient than variants of the self-organizing map.
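The novelty map population mentioned in the abstract can be illustrated with a small sketch. The Python code below is a hypothetical, minimal rendition under stated assumptions (Euclidean distance as the novelty measure, and a replace-the-least-novel rule once the map is full); it is not the exact procedure from the paper.

```python
import math


class NoveltyMap:
    """Minimal sketch of a novelty-map population: a fixed-size set of
    cells (input prototypes) that retains the most mutually novel inputs
    seen so far. The novelty measure and replacement rule here are
    illustrative assumptions, not the paper's exact algorithm."""

    def __init__(self, size):
        self.size = size
        self.cells = []  # stored prototypes, one per classifier/team slot

    def _novelty(self, x, exclude=None):
        # Novelty of x = distance to its nearest stored cell
        # (optionally ignoring one cell, used when scoring a cell itself).
        dists = [math.dist(x, c)
                 for i, c in enumerate(self.cells) if i != exclude]
        return min(dists) if dists else math.inf

    def insert(self, x):
        if len(self.cells) < self.size:
            self.cells.append(list(x))
            return
        # Replace the least novel cell when x is more novel than it,
        # so the map covers the input space rather than its density.
        novelties = [self._novelty(c, exclude=i)
                     for i, c in enumerate(self.cells)]
        worst = min(range(self.size), key=novelties.__getitem__)
        if self._novelty(x) > novelties[worst]:
            self.cells[worst] = list(x)

    def nearest(self, x):
        # Index of the cell activated by input x.
        return min(range(len(self.cells)),
                   key=lambda i: math.dist(x, self.cells[i]))
```

Because insertion depends on how novel an input is rather than how often it occurs, the stored cells tend to spread across the input space instead of tracking its density, which is the property the abstract contrasts with self-organizing map variants.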

Original language: English
Title of host publication: Proceedings of the SICE Annual Conference
Publisher: Society of Instrument and Control Engineers (SICE)
Pages: 1785-1792
Number of pages: 8
ISBN (Electronic): 9784907764463
DOIs: 10.1109/SICE.2014.6935299
Publication status: Published - Oct 23 2014
Event: 2014 53rd Annual Conference of the Society of Instrument and Control Engineers of Japan, SICE 2014 - Sapporo, Japan
Duration: Sep 9 2014 - Sep 12 2014

Publication series

Name: Proceedings of the SICE Annual Conference

Other

Other: 2014 53rd Annual Conference of the Society of Instrument and Control Engineers of Japan, SICE 2014
Country: Japan
City: Sapporo
Period: 9/9/14 - 9/12/14

Fingerprint

  • Reinforcement learning
  • Classifiers
  • Self organizing maps
  • Learning algorithms

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Computer Science Applications
  • Electrical and Electronic Engineering

Cite this

Vargas, D. V., Takano, H., & Murata, J. (2014). Novelty-organizing team of classifiers - A team-individual multi-objective approach to reinforcement learning. In Proceedings of the SICE Annual Conference (pp. 1785-1792). [6935299] (Proceedings of the SICE Annual Conference). Society of Instrument and Control Engineers (SICE). https://doi.org/10.1109/SICE.2014.6935299

Novelty-organizing team of classifiers - A team-individual multi-objective approach to reinforcement learning. / Vargas, Danilo Vasconcellos; Takano, Hirotaka; Murata, Junichi.

Proceedings of the SICE Annual Conference. Society of Instrument and Control Engineers (SICE), 2014. p. 1785-1792 6935299 (Proceedings of the SICE Annual Conference).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Vargas, DV, Takano, H & Murata, J 2014, Novelty-organizing team of classifiers - A team-individual multi-objective approach to reinforcement learning. in Proceedings of the SICE Annual Conference., 6935299, Proceedings of the SICE Annual Conference, Society of Instrument and Control Engineers (SICE), pp. 1785-1792, 2014 53rd Annual Conference of the Society of Instrument and Control Engineers of Japan, SICE 2014, Sapporo, Japan, 9/9/14. https://doi.org/10.1109/SICE.2014.6935299

Vargas DV, Takano H, Murata J. Novelty-organizing team of classifiers - A team-individual multi-objective approach to reinforcement learning. In Proceedings of the SICE Annual Conference. Society of Instrument and Control Engineers (SICE). 2014. p. 1785-1792. 6935299. (Proceedings of the SICE Annual Conference). https://doi.org/10.1109/SICE.2014.6935299

Vargas, Danilo Vasconcellos ; Takano, Hirotaka ; Murata, Junichi. / Novelty-organizing team of classifiers - A team-individual multi-objective approach to reinforcement learning. Proceedings of the SICE Annual Conference. Society of Instrument and Control Engineers (SICE), 2014. pp. 1785-1792 (Proceedings of the SICE Annual Conference).
@inproceedings{0d79dec4eb06486c901e5a5d57a3b1bf,
title = "Novelty-organizing team of classifiers - A team-individual multi-objective approach to reinforcement learning",
abstract = "In reinforcement learning, there are essentially two spaces to search: value-function space and policy space. Consequently, there are two fitness functions, each with its associated trade-offs. However, the problem is still perceived as a single-objective one. Here, a multi-objective reinforcement learning algorithm is proposed, with a structured novelty map population evolving feedforward neural models. It outperforms a gradient-based, continuous input-output, state-of-the-art algorithm on two problems. Unlike the gradient-based algorithm, the proposed one solves both problems with the same parameters and with smaller variance in the results. Moreover, the results are comparable even with discrete-action algorithms from the literature, as well as neuroevolution methods such as NEAT. The proposed method also introduces the novelty map population concept, i.e., a novelty map-based population that is less sensitive to the input distribution and therefore more suitable for creating the state space. In fact, the novelty map framework is shown to be less dynamic and more resource-efficient than variants of the self-organizing map.",
author = "Vargas, {Danilo Vasconcellos} and Hirotaka Takano and Junichi Murata",
year = "2014",
month = "10",
day = "23",
doi = "10.1109/SICE.2014.6935299",
language = "English",
series = "Proceedings of the SICE Annual Conference",
publisher = "Society of Instrument and Control Engineers (SICE)",
pages = "1785--1792",
booktitle = "Proceedings of the SICE Annual Conference",

}

TY - GEN

T1 - Novelty-organizing team of classifiers - A team-individual multi-objective approach to reinforcement learning

AU - Vargas, Danilo Vasconcellos

AU - Takano, Hirotaka

AU - Murata, Junichi

PY - 2014/10/23

Y1 - 2014/10/23

N2 - In reinforcement learning, there are essentially two spaces to search: value-function space and policy space. Consequently, there are two fitness functions, each with its associated trade-offs. However, the problem is still perceived as a single-objective one. Here, a multi-objective reinforcement learning algorithm is proposed, with a structured novelty map population evolving feedforward neural models. It outperforms a gradient-based, continuous input-output, state-of-the-art algorithm on two problems. Unlike the gradient-based algorithm, the proposed one solves both problems with the same parameters and with smaller variance in the results. Moreover, the results are comparable even with discrete-action algorithms from the literature, as well as neuroevolution methods such as NEAT. The proposed method also introduces the novelty map population concept, i.e., a novelty map-based population that is less sensitive to the input distribution and therefore more suitable for creating the state space. In fact, the novelty map framework is shown to be less dynamic and more resource-efficient than variants of the self-organizing map.

AB - In reinforcement learning, there are essentially two spaces to search: value-function space and policy space. Consequently, there are two fitness functions, each with its associated trade-offs. However, the problem is still perceived as a single-objective one. Here, a multi-objective reinforcement learning algorithm is proposed, with a structured novelty map population evolving feedforward neural models. It outperforms a gradient-based, continuous input-output, state-of-the-art algorithm on two problems. Unlike the gradient-based algorithm, the proposed one solves both problems with the same parameters and with smaller variance in the results. Moreover, the results are comparable even with discrete-action algorithms from the literature, as well as neuroevolution methods such as NEAT. The proposed method also introduces the novelty map population concept, i.e., a novelty map-based population that is less sensitive to the input distribution and therefore more suitable for creating the state space. In fact, the novelty map framework is shown to be less dynamic and more resource-efficient than variants of the self-organizing map.

UR - http://www.scopus.com/inward/record.url?scp=84911932820&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84911932820&partnerID=8YFLogxK

U2 - 10.1109/SICE.2014.6935299

DO - 10.1109/SICE.2014.6935299

M3 - Conference contribution

AN - SCOPUS:84911932820

T3 - Proceedings of the SICE Annual Conference

SP - 1785

EP - 1792

BT - Proceedings of the SICE Annual Conference

PB - Society of Instrument and Control Engineers (SICE)

ER -