Hierarchical Reinforcement Learning in Multi-Agent Environment

Student thesis: Master thesis (including HD thesis)

  • Dennis Kjærulff Pedersen
  • Tim Boesen
4. term, Computer Science, Master (Master Programme)
The purpose of this report is to explore the area of Hierarchical Reinforcement Learning. First, a hierarchical reinforcement learning approach called the MaxQ value function decomposition is described in detail. Using MaxQ, the state space can be reduced considerably. To support the claim that MaxQ performs better than the basic reinforcement learning algorithm, a test comparing the two is performed. The results clearly show that as the complexity grows, so does the difference in performance between the two.
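The core of the MaxQ decomposition (Dietterich, 2000) is the recurrence Q(p, s, a) = V(a, s) + C(p, s, a): the value of invoking subtask a inside parent task p equals the value of a itself plus a learned completion value for finishing p afterwards. The following is a minimal sketch of that recurrence; the task hierarchy, state names, and table values are illustrative assumptions, not taken from the thesis.

```python
# Hedged sketch of the MaxQ value-function decomposition.
# The hierarchy and the V/C tables below are made-up examples.

children = {"root": ["navigate", "pickup"]}            # composite task -> subtasks
V = {("navigate", "s0"): 1.0, ("pickup", "s0"): 2.0}   # primitive action values
C = {("root", "s0", "navigate"): 0.5,                  # completion values C(p, s, a)
     ("root", "s0", "pickup"): 0.0}

def v(task, state):
    """V(a, s): expected value of executing task a from state s."""
    if task not in children:                # primitive action: learned reward
        return V[(task, state)]
    return max(q(task, state, a) for a in children[task])

def q(parent, state, subtask):
    """Q(p, s, a) = V(a, s) + C(p, s, a)."""
    return v(subtask, state) + C[(parent, state, subtask)]

print(v("root", "s0"))  # pickup: 2.0 + 0.0 beats navigate: 1.0 + 0.5 -> 2.0
```

Because each subtask only needs the state variables relevant to its own subgoal, the V and C tables are far smaller than a flat Q-table over the full joint state space, which is where the state-space reduction comes from.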
The MaxQ algorithm does not allow agents to cooperate. An extension to MaxQ is presented that allows agents with the same task decomposition to coordinate and cooperate at a high level of abstraction. We show that two agents using the new algorithm do in fact deliver better results than two agents using the basic MaxQ value function decomposition.
To further explore the area of multi-agent reinforcement learning, we propose two approaches that deal with heterogeneity in multi-agent environments. The first approach uses experience sharing to speed up learning, while the other extends the multi-agent hierarchical algorithm to allow agents with different task decompositions to cooperate.
Publication date: Jun 2005
ID: 61065338