Reinforcement learning is applicable to complex or unknown problems because the solution search process is done by trial-and-error. However, the calculation time for the trial-and-error search becomes larger as the scale of the problem increases. Therefore, in order to decrease calculation time, some methods have been proposed using the prior information on the problem. This paper improves a previously proposed method utilizing options as prior information. In order to increase the learning speed even with wrong options, methods for option correction by forgetting the policy and extending initiation sets are proposed.
|Number of pages||10|
|Journal||Journal of Advanced Computational Intelligence and Intelligent Informatics|
|Publication status||Published - Jan 1 2013|
All Science Journal Classification (ASJC) codes
- Human-Computer Interaction
- Computer Vision and Pattern Recognition
- Artificial Intelligence