Selected article from IEEE Transactions on Cognitive and Developmental Systems
Reinforcement learning (RL) problems are hard to solve in a robotics context because classical algorithms rely on discrete representations of actions and states, whereas in robotics both are continuous. This article proposes a process by which a robot builds its own representation for an RL algorithm. The principle is to first use a direct policy search in the sensori-motor space, i.e., with no predefined discrete sets of states or actions, and then to extract discrete actions from the corresponding learning traces and identify the dimensions of the state that are relevant for estimating the value function. Once this is done, the robot can apply RL: 1) to be more robust to new domains and, if required, 2) to learn faster than a direct policy search. This approach takes the best of both worlds: it first learns in a continuous space, which avoids the need for a specific representation at the price of a long learning process and poor generalization, and then learns with an adapted representation to be faster and more robust.
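The two-stage idea can be illustrated with a minimal sketch. It is not the method of the article: it assumes scalar actions, uses plain k-means as a stand-in for extracting discrete actions from learning traces, uses a correlation filter as a stand-in for identifying relevant state dimensions, and uses a tabular Q-learning update for the RL stage. All function names (`discretize_actions`, `relevant_dims`, `q_update`) and the toy traces are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def discretize_actions(actions, k, iters=20):
    """Stand-in for action extraction: Lloyd's k-means on scalar actions."""
    s = sorted(actions)
    # Initialise the k centers at evenly spaced quantiles of the data.
    centers = [s[(2 * i + 1) * len(s) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for a in actions:
            nearest = min(range(k), key=lambda i: abs(a - centers[i]))
            groups[nearest].append(a)
        # An empty cluster keeps its previous center.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return sorted(centers)


def relevant_dims(states, actions, threshold=0.5):
    """Stand-in for state selection: keep dimensions whose absolute
    Pearson correlation with the chosen action exceeds `threshold`."""
    kept = []
    a_mean, a_std = mean(actions), pstdev(actions)
    for d in range(len(states[0])):
        col = [s[d] for s in states]
        c_mean, c_std = mean(col), pstdev(col)
        if a_std == 0 or c_std == 0:
            continue  # a constant dimension cannot correlate with anything
        cov = mean((x - c_mean) * (a - a_mean) for x, a in zip(col, actions))
        if abs(cov / (c_std * a_std)) > threshold:
            kept.append(d)
    return kept


def q_update(q, s, a, r, s_next, n_actions, alpha=0.5, gamma=0.9):
    """One tabular Q-learning update on the learned discrete representation."""
    best_next = max(q[(s_next, b)] for b in range(n_actions))
    q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])


# Toy traces, assumed to come from a prior direct policy search.
states = [(-1.0, 0.3), (-0.8, 0.3), (0.9, 0.3), (1.0, 0.3)]
actions = [-1.1, -0.9, 0.9, 1.1]

prototypes = discretize_actions(actions, k=2)  # two discrete actions
dims = relevant_dims(states, actions)          # second dimension is constant

q = defaultdict(float)
q_update(q, s=(0,), a=1, r=1.0, s_next=(1,), n_actions=len(prototypes))
```

In this sketch the continuous traces are only used to build the representation; the Q-table is then indexed by the reduced, discrete state and by the index of the action prototype, which is what would allow a tabular algorithm to be applied afterwards.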
IEEE Transactions on Cognitive and Developmental Systems, Mar. 2018