A master's thesis from Aalborg University

Learning Action Primitives From 3D Stereo Vision Measurements

Author

Term

4th term

Publication year

2010

Abstract

Understanding how objects move in images is central to computer vision. Many existing models treat each motion trajectory separately, even when trajectories share common segments, which produces redundant and uncorrelated data. These shared segments are action primitives: basic building blocks of movement. This thesis presents a framework that visually tracks objects using color segmentation (partitioning the image into regions by color) and Hidden Markov Models (probabilistic models for sequences), records motion trajectories, identifies action primitives, and builds a single combined model that represents the different trajectories jointly and efficiently. The action-primitive model can serve as a learning model for a robot by providing a higher-level description of the actions performed. The framework is implemented as a C++ library that provides a complete solution for object detection, tracking, motion recording, and model building.

[This abstract was generated with the help of AI]