TY - GEN
T1 - A method for finding multiple subgoals for reinforcement learning
AU - Ogihara, Fuminori
AU - Murata, Junichi
PY - 2011/12/1
Y1 - 2011/12/1
N2 - This paper proposes a new method for automatically discovering multiple subgoals to accelerate reinforcement learning. Several methods for subgoal discovery have already been proposed. Some use state-visiting frequencies in trajectories that reach the goal state: a state whose visiting frequency is very high is regarded as a subgoal. Because these methods require the goal state to be reached many times in order to collect trajectories, they take a long time to discover subgoals. In addition, they cannot discover potential subgoals that would become appropriate subgoals if the goal state changed. Other methods identify subgoals by partitioning local state transition graphs, but these methods require a large amount of computation. We propose a new method that addresses these drawbacks. The new method also uses state-visiting frequencies, but it collects trajectories that pass through particular non-goal states selected at random, with a separate set of trajectories collected for each such state. Most trajectories reach such a state more easily than they reach the goal state. We therefore expect the method to discover subgoals quickly and to discover multiple subgoals together.
AB - This paper proposes a new method for automatically discovering multiple subgoals to accelerate reinforcement learning. Several methods for subgoal discovery have already been proposed. Some use state-visiting frequencies in trajectories that reach the goal state: a state whose visiting frequency is very high is regarded as a subgoal. Because these methods require the goal state to be reached many times in order to collect trajectories, they take a long time to discover subgoals. In addition, they cannot discover potential subgoals that would become appropriate subgoals if the goal state changed. Other methods identify subgoals by partitioning local state transition graphs, but these methods require a large amount of computation. We propose a new method that addresses these drawbacks. The new method also uses state-visiting frequencies, but it collects trajectories that pass through particular non-goal states selected at random, with a separate set of trajectories collected for each such state. Most trajectories reach such a state more easily than they reach the goal state. We therefore expect the method to discover subgoals quickly and to discover multiple subgoals together.
UR - http://www.scopus.com/inward/record.url?scp=84866718858&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84866718858&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84866718858
SN - 9784990288051
T3 - Proceedings of the 16th International Symposium on Artificial Life and Robotics, AROB 16th'11
SP - 804
EP - 807
BT - Proceedings of the 16th International Symposium on Artificial Life and Robotics, AROB 16th'11
T2 - 16th International Symposium on Artificial Life and Robotics, AROB '11
Y2 - 27 January 2011 through 29 January 2011
ER -