A master's thesis from Aalborg University


Hierarchical Reinforcement Learning in Multi-Agent Environment

Authors


Term

4th term

Publication year

2005

Abstract


This thesis explores hierarchical reinforcement learning, in which an agent learns by trial and error while breaking complex tasks into smaller subtasks. We describe the MaxQ value function decomposition in detail. MaxQ can reduce the state space (the set of all possible situations) considerably, making learning more tractable. To support the claim that MaxQ outperforms a basic, non-hierarchical reinforcement learning algorithm, we compare the two empirically. The results show that the more complex the problem, the larger the performance gap in favor of MaxQ. Standard MaxQ, however, does not let multiple agents cooperate. We therefore introduce an extension that allows agents sharing the same task decomposition to coordinate and cooperate at a high level of abstraction, and we show that two agents using this extension outperform two agents using basic MaxQ. To further study multi-agent reinforcement learning in heterogeneous settings (where agents or their tasks may differ), we propose two approaches: sharing experience among agents to speed up learning, and extending the hierarchical multi-agent method so that agents with different task decompositions can cooperate.
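
For context, the core of the MaxQ decomposition, as introduced by Dietterich (2000), can be stated in a few lines; this is the standard formulation from the literature and may differ from the exact notation used in the thesis. The value of invoking subtask a within parent task i at state s splits into the reward accumulated while a runs and a completion term for finishing i afterwards:

Q(i, s, a) = V(a, s) + C(i, s, a)
V(i, s) = max_a Q(i, s, a)    (composite subtask i)
V(i, s) = E[ r | s, i ]       (primitive action i)

Here C(i, s, a), the completion function, is the expected cumulative reward for completing task i after subtask a terminates. Because each subtask needs only the state variables relevant to it, learning a completion function per subtask rather than one flat Q-table over the full state-action space is what shrinks the effective state space. For the multi-agent extension described in the abstract, one common formulation in the literature (not necessarily the one used in this thesis) conditions the completion functions of cooperative subtasks on the joint subtask choices of all agents, so that coordination happens at the level of subtasks rather than primitive actions.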
