Deriving Subgoals Using Network Distillation
Authors
Ljørring, Nikolaj ; Jensen, Lars Svane ; Mohammadi Landi, Aryan
Term
4th term
Education
Publication year
2021
Submitted on
2021-06-11
Pages
12
Abstract
Environments with sparse rewards make it hard for deep reinforcement learning (trial-and-error learning with neural networks) to discover and master good strategies. Hierarchical reinforcement learning can help by breaking tasks into subgoals that are easier for the agent to learn. However, automatically discovering effective subgoals is slow. We therefore propose a new method for finding and constructing subgoals, including a more time-efficient way to compare candidate subgoals. We also introduce a novel distributed training framework to increase the agent's throughput (the amount of experience processed over time). Our results indicate that the framework increases data gathering but decreases learning compared to a non-distributed setup.
[This abstract has been rewritten with the help of AI based on the project's original abstract]
Keywords
