A master's thesis from Aalborg University


SKYWALKER - Autonomous Control of a Free-Floating Space Manipulator in Simulated Microgravity Using Reinforcement Learning

Authors


Term

4th semester

Education

Publication year

2025

Submitted on

Pages

83

Abstract


This project investigates whether Proximal Policy Optimization (PPO)—a deep reinforcement learning method that learns control through trial and error—can control a robotic arm on a free-floating base in a simulated microgravity environment. We built a simplified version of ESA’s Orbital Robotics Laboratory in Isaac Lab, including a robot arm, fixed grasp points, and a frictionless floor to mimic weightlessness. The controller was trained with curriculum learning, starting from simple tasks and gradually increasing difficulty. We evaluated performance with structured acceptance tests: point-to-point motion, grasping, base relocation, and a multi-step traversal. PPO produced smooth and accurate motions without predefined paths in the early tasks, but it did not solve the final traversal, highlighting the challenge of planning many steps ahead. Despite this limitation, the work demonstrates a functional reinforcement learning control pipeline and a validated simulation setup, laying the groundwork for future tests on physical platforms. Overall, the results provide a proof of concept for using deep reinforcement learning to enable autonomous manipulation and movement under space-like conditions.
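To illustrate the curriculum-learning idea described above, the sketch below shows one common way such a schedule can be driven: promote the policy to a harder task level once its recent success rate crosses a threshold. All names and parameters here are hypothetical illustrations, not the thesis authors' actual implementation or the Isaac Lab API.

```python
from collections import deque

class CurriculumScheduler:
    """Minimal curriculum sketch (hypothetical, not the thesis code):
    advance to a harder task level once the recent success rate
    over a sliding window exceeds a threshold."""

    def __init__(self, num_levels=4, window=100, threshold=0.8):
        self.level = 0
        self.num_levels = num_levels
        self.threshold = threshold
        self.results = deque(maxlen=window)  # sliding window of outcomes

    def record(self, success: bool):
        self.results.append(success)
        # Promote only when the window is full, mostly successful,
        # and a harder level still exists.
        if (len(self.results) == self.results.maxlen
                and sum(self.results) / len(self.results) >= self.threshold
                and self.level < self.num_levels - 1):
            self.level += 1
            self.results.clear()  # reset statistics for the new level

# Example: 100 successes on level 0 promote the curriculum to level 1.
sched = CurriculumScheduler()
for _ in range(100):
    sched.record(True)
print(sched.level)  # 1
```

In a training loop, `record()` would be called once per episode, and the current `level` would parameterize task difficulty (e.g., target distance or number of traversal steps).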

[This summary has been rewritten with the help of AI based on the project's original abstract]