Abstract:
Until recently, value function-based reinforcement learning methods were widely considered the only feasible way of solving general stochastic optimal control problems. Unfortunately, these approaches are inapplicable to real-world problems that are continuous, high-dimensional and partially observable, such as motor control tasks.
While policy-gradient reinforcement learning methods offer a suitable approach to such tasks, they suffer from typical parametric learning issues such as model selection and catastrophic forgetting. This thesis investigates the application of policy-gradient learning to a range of simulated motor learning tasks and introduces the use of local factored policies to enable incremental learning in tasks of unknown complexity.