Solving Complex Problems with Deep Multi-Level Skill Hierarchies
Student thesis: Master thesis (including HD thesis)
- Tobias Lambek Jacobsen
- Nicolaj Casanova Abildgaard
4. term, Software, Master (Master Programme)
The notion of using pre-trained skills to reduce training time and to facilitate lifelong learning in Deep
Reinforcement Learning (DRL) has been around for a long time. However, the number of skills required to
work in an environment grows with the number of tasks in that environment. As a consequence, the
complexity of the action space increases and agents need to train for longer in order to master all tasks.
In this paper we propose a framework for Deep Multi-Level Skill Hierarchies (D-MuLSH) as a solution to this
problem. This framework is an extended version of the Hierarchical Deep Reinforcement Learning Network
(H-DRLN) that adds the ability to arrange the skill hierarchy in multiple levels. Simple skills are grouped
into complex categories using pre-trained Major Skill Networks (MSNs), and agents only need to learn
when to use each category, rather than when to use each individual skill. We show that D-MuLSH
reduces training time in the ViZDoom environment compared to the H-DRLN.
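
The grouping idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the class names, the skill names, the integer action ids, and the hand-written selector functions are all hypothetical stand-ins for pre-trained networks, chosen only to show how a two-level hierarchy shrinks the number of choices the top-level agent has to learn.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Observation = Sequence[float]


@dataclass
class Skill:
    """A simple skill: maps an observation to a primitive action id (hypothetical)."""
    name: str
    policy: Callable[[Observation], int]

    def act(self, obs: Observation) -> int:
        return self.policy(obs)


@dataclass
class MajorSkillNetwork:
    """A category of skills. `selector` stands in for a pre-trained network
    that picks which of its skills to run for a given observation."""
    name: str
    skills: List[Skill]
    selector: Callable[[Observation], int]

    def act(self, obs: Observation) -> int:
        chosen = self.skills[self.selector(obs)]
        return chosen.act(obs)


def flat_choice_count(categories: List[MajorSkillNetwork]) -> int:
    """Choices the agent faces if it must select individual skills directly."""
    return sum(len(c.skills) for c in categories)


def hierarchical_choice_count(categories: List[MajorSkillNetwork]) -> int:
    """Choices the agent faces if it only selects a category."""
    return len(categories)


# Hypothetical skills and categories; action ids 0..3 are placeholders.
navigation = MajorSkillNetwork(
    name="navigation",
    skills=[Skill("turn_left", lambda obs: 0), Skill("move_forward", lambda obs: 1)],
    selector=lambda obs: 1 if obs[0] > 0 else 0,
)
combat = MajorSkillNetwork(
    name="combat",
    skills=[Skill("aim", lambda obs: 2), Skill("shoot", lambda obs: 3)],
    selector=lambda obs: 1 if obs[0] > 0 else 0,
)
categories = [navigation, combat]

flat = flat_choice_count(categories)
hierarchical = hierarchical_choice_count(categories)
nav_action = navigation.act([1.0])
combat_action = combat.act([-1.0])
```

In this toy setup the top-level agent chooses between two categories instead of four individual skills; the gap widens as more skills are added to each category.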
Language | English |
---|---|
Publication date | 4 Jun 2021 |
Number of pages | 9 |