To behave adaptively in a real environment, a robot must be able to recognize motion in the world ("other-motion") and to generate its own motion ("self-motion"). We are currently developing a system composed of an "other-motion" recognition module and a "self-motion" generation module. This paper focuses on "other-motion" recognition that is based on "self-motion." The recognition and generation modules are each constructed using reinforcement learning and a Hidden Markov Model (HMM). Estimating the HMM requires many samples of the motion to be learned. However, there is no guarantee that a sufficient amount of motion data can be acquired in the real world, so the reliability of the estimated HMM may be low. To solve this problem, this paper presents a new method for estimating an HMM from the results of reinforcement learning. The state value function obtained by reinforcement learning is divided into several clusters, and each cluster is made to correspond to one state of the HMM. An output distribution can thereby be estimated from the values of the value function. Experimental results show that our method can estimate the HMM's parameters not only from a small number of samples but also from the value functions of the generation module, and that the reliability of the estimated HMM can be improved.
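The idea of mapping clusters of the state value function to HMM states can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: equal-width binning stands in for whatever clustering the paper actually uses, the value-proportional weighting is only one hypothetical way to derive an output distribution from the value function, and the names `cluster_values` and `output_distributions` and the sample values are invented for this example.

```python
# Illustrative sketch only. The binning rule and the value-proportional
# weighting are assumptions, not the method described in the paper.


def cluster_values(values, n_clusters):
    """Assign each RL state to a cluster by equal-width binning of its value."""
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 when all values are identical.
    width = (hi - lo) / n_clusters or 1.0
    labels = []
    for v in values:
        k = int((v - lo) / width)
        # The maximum value would land one past the end, so clamp it.
        labels.append(min(k, n_clusters - 1))
    return labels


def output_distributions(values, labels, n_clusters, eps=1e-6):
    """Build one distribution over RL states per cluster (HMM state).

    Assumption: within a cluster, the probability of emitting RL state s
    is proportional to V(s), shifted so that every weight is positive.
    """
    shift = min(values)
    dists = []
    for k in range(n_clusters):
        weights = [
            (v - shift + eps) if lab == k else 0.0
            for v, lab in zip(values, labels)
        ]
        total = sum(weights)
        if total == 0.0:
            # Empty cluster: no RL state was assigned to this HMM state.
            dists.append([0.0] * len(values))
        else:
            dists.append([w / total for w in weights])
    return dists


# Hypothetical state values for seven discrete RL states.
state_values = [0.0, 0.1, 0.2, 0.5, 0.6, 0.9, 1.0]
labels = cluster_values(state_values, n_clusters=3)
emissions = output_distributions(state_values, labels, n_clusters=3)
```

Here each row of `emissions` plays the role of the output distribution of one HMM state, and it is obtained without any observed motion samples. A real system would presumably combine such value-derived estimates with whatever sample data is available.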
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- Information Systems
- Hardware and Architecture
- Computational Theory and Mathematics