A master's thesis from Aalborg University


Simulation Framework for Supervised Reinforcement Learning

Translated title

Simulations Framework for Superviseret Reinforcement Learning

Authors


Term

4th term

Education

Publication year

2012

Submitted on

Pages

76

Abstract

This thesis investigates MDP-based reinforcement learning for use in real-world learning environments, motivated by a previously developed robot platform that tracks LEGO robots with a Microsoft Kinect and exhibits motion uncertainty. We survey and test dynamic programming, Monte Carlo, and temporal-difference methods alongside generalization techniques and supervised learning. We build a simulator-driven learning framework and run experiments in a toy cat-and-mouse game. Within the framework, a supervisor is used to guide learning, and radial basis functions approximate value functions to enable generalization over large state spaces. Experiments indicate that supervision can accelerate learning and remains beneficial even when the supervisor and the trained agent are trained in different environments, and that radial basis functions support value approximation and generalization. Taken together, the work demonstrates a practical simulation framework for supervised reinforcement learning with function approximation, laying groundwork for deployment on the physical robot platform.
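
As an illustration of the value approximation mentioned in the abstract, the following is a minimal sketch of a TD(0) update for a linear value function over Gaussian radial basis features. It is not taken from the thesis: the scalar state, the function names, the grid of centers, the width, the step size, and the sample transition are all assumptions chosen for the example.

```python
import math

def rbf_features(state, centers, width):
    """Gaussian radial basis features of a scalar state (illustrative only)."""
    return [math.exp(-((state - c) ** 2) / (2.0 * width ** 2)) for c in centers]

def value(weights, features):
    """Linear value estimate: dot product of weights and features."""
    return sum(w * f for w, f in zip(weights, features))

def td0_update(weights, state, reward, next_state, centers, width,
               alpha=0.1, gamma=0.9, terminal=False):
    """One TD(0) step; returns the updated weight list."""
    phi = rbf_features(state, centers, width)
    # A terminal transition has no successor value to bootstrap from.
    next_v = 0.0 if terminal else value(
        weights, rbf_features(next_state, centers, width))
    td_error = reward + gamma * next_v - value(weights, phi)
    return [w + alpha * td_error * f for w, f in zip(weights, phi)]

# Hypothetical setup: five evenly spaced centers on [0, 1], width 0.2.
centers = [0.0, 0.25, 0.5, 0.75, 1.0]
width = 0.2
weights = [0.0] * len(centers)

# Repeatedly observe one assumed terminal transition: state 0.9, reward 1.
for _ in range(200):
    weights = td0_update(weights, 0.9, 1.0, 1.0, centers, width, terminal=True)

v_seen = value(weights, rbf_features(0.9, centers, width))
v_near = value(weights, rbf_features(0.85, centers, width))  # never visited
v_far = value(weights, rbf_features(0.1, centers, width))    # never visited
```

In this sketch only the state 0.9 is ever visited, yet the nearby state 0.85 receives a similar value estimate while the distant state 0.1 stays close to zero. That is the kind of generalization over neighbouring states that overlapping basis functions are meant to provide.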


[This abstract has been generated with the help of AI directly from the project full text]