• Jevgenij Martinkevic
4th semester, Medialogy, Master (Master's Programme)
It has recently been demonstrated that Reinforcement Learning methods can be used to solve a range of locomotion, navigation and robotics tasks, as well as reach exceptional performance in a number of games of varying complexity. However, converting the goal of a problem into a reward signal has also proven to be an extremely complicated and time-consuming task. In this thesis, a Curriculum-based Reinforcement Learning approach is investigated and applied to solve a navigation task using only a sparse reward signal. The training process is performed in a simulated learning environment built within the Unity Engine, and the agents are trained with the Proximal Policy Optimization method as implemented in the Unity Machine Learning Agents Toolkit. The results show that the target task could not be solved by a typical reinforcement learning agent using only a sparse reward signal within the given time. However, the agent trained with an environment-centered curriculum, where the task is deconstructed and introduced to the agent in lessons of increasing difficulty, managed to solve the target task and reach a success rate of 99%. Furthermore, a combined application of curriculum learning and reward shaping is investigated. It is observed that this can negatively affect the training process if the two approaches overlap by encouraging the same type of behavior.
Publication date: 31 Aug. 2018
Number of pages: 46
ID: 286116734