A master's thesis from Aalborg University

Low Level Robot Control Using A Multi Modal Foundation Model: P10 project

Term

4. semester

Education

Publication year

2024

Submitted on

Pages

69

Abstract

Robotic systems are often highly specialized and offer little flexibility across tasks. In this report, we describe our implementation of a custom robotic control stack, built to experiment with Octo, a multi-modal foundation model, for low-level control of a robotic manipulator. Octo is designed for flexibility: it can run on various robotic hardware and perform a wide range of tasks. We fine-tuned Octo on our own data, recorded using tools developed for this project and stored in a standardized format for future use in training robotic systems. To train and run Octo, we created a custom robot environment, integrated it with a Polymetis server wrapped in a ZeroRPC server, developed a VR control system for intuitive robot control, and built our own data recording tools. We modified existing Octo scripts to fit our use case and successfully fine-tuned and ran Octo in our custom environment. Our model was trained to use two camera inputs and a task description to pick up an arbitrary object