A master's thesis from Aalborg University


Combining Relational and Hierarchical Reinforcement Learning

Author

Term

4th term

Publication year

2005

Abstract

Reinforcement learning teaches an agent to act optimally in an environment by rewarding good actions and penalizing poor ones. As problem domains grow, how we represent the domain and its solution becomes critical. In many real-world settings, we cannot list every possible situation in a simple, table-based form. This work examines two ways to address that. The first is relational reinforcement learning, which combines reinforcement learning with inductive logic programming (a way to learn logical rules from data) to create state abstractions and improve generalization. The second is hierarchical reinforcement learning using the MAXQ value function decomposition, which enables state abstractions by breaking the primary task into smaller sub-tasks. We then explore combining these two methods. The result is a general approach that benefits from both inductive logic programming and hierarchical decomposition.
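To make the hierarchical idea concrete, the following is a minimal sketch of how a MAXQ-style decomposition evaluates a task by recursion over its sub-tasks: the value of choosing child a inside parent i is the value of a itself plus a completion term for finishing i afterwards, Q(i, s, a) = V(a, s) + C(i, s, a). The task names, the `CHILDREN` hierarchy, and the table contents are hypothetical illustrations, not taken from the thesis, and the code shows only the evaluation step, not learning.

```python
from typing import Dict, List, Tuple

# Hypothetical task hierarchy (an assumption for illustration).
# Tasks that do not appear as keys are treated as primitive actions.
CHILDREN: Dict[str, List[str]] = {
    "root": ["get", "put"],
    "get": ["north", "pickup"],
    "put": ["south", "dropoff"],
}

PrimTable = Dict[Tuple[str, str], float]       # (primitive action, state) -> expected reward
ComplTable = Dict[Tuple[str, str, str], float]  # (parent, state, child) -> completion value


def value(task: str, state: str, v_prim: PrimTable, compl: ComplTable) -> float:
    """V(task, state): table lookup for primitives, max over children otherwise."""
    if task not in CHILDREN:
        return v_prim.get((task, state), 0.0)
    return max(q_value(task, state, child, v_prim, compl) for child in CHILDREN[task])


def q_value(parent: str, state: str, child: str,
            v_prim: PrimTable, compl: ComplTable) -> float:
    """Q(parent, state, child) = V(child, state) + C(parent, state, child)."""
    return value(child, state, v_prim, compl) + compl.get((parent, state, child), 0.0)


if __name__ == "__main__":
    # Made-up numbers for a single state "s0"; missing entries default to 0.
    v_prim = {("north", "s0"): -1.0, ("pickup", "s0"): -10.0,
              ("south", "s0"): -1.0, ("dropoff", "s0"): -10.0}
    compl = {("get", "s0", "north"): 5.0, ("root", "s0", "get"): 10.0}
    print(value("root", "s0", v_prim, compl))
```

Because each sub-task keeps its own tables, a sub-task can ignore state variables that are irrelevant to it, which is the kind of state abstraction the abstract refers to.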

[This abstract was generated with the help of AI]