Where can I go? Deep multi-modal scene understanding for outdoor navigation

Author

Humblot-Renaux, Galadrielle Eve Giséle Elisabeth

Term

4th semester

Education

Robotics, M.Sc.

Publication year

2021

Submitted on

2021-06-03

Pages

150

Abstract

This project delves into deep learning-based computer vision for scene understanding in the context of autonomous outdoor navigation. Rather than relying on specific scene-dependent semantic categories, we take an affordance-based approach, proposing to parse egocentric images in terms of how a vehicle or robot can drive in them. We use a SegNet-based image segmentation network as our building block for classifying pixels into 3 driveability levels, and explore soft labelling, pixel-wise loss weighting, and deep adaptive fusion schemes to penalize severe mistakes during learning, improve segmentation in regions of interest, and incorporate infrared and depth data into the prediction. The proposed training schemes and multi-modal architecture are evaluated on 9 public datasets, showing promising results across unstructured forested environments, urban driving scenes, and multi-view hand-held captures.
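
As an illustration of how soft labelling and pixel-wise loss weighting can penalize severe mistakes more than near misses, here is a minimal NumPy sketch. It is not taken from the thesis: the abstract does not state the exact formulation, so the distance-based soft targets, the `sharpness` parameter, and the function names `soft_labels` and `weighted_soft_cross_entropy` are all assumptions made for this example.

```python
import numpy as np

def soft_labels(hard, num_levels=3, sharpness=2.0):
    """Turn an integer level map (H, W) into soft targets (H, W, num_levels).

    Assumed scheme: probability mass decays with the ordinal distance
    between each level and the annotated level.
    """
    levels = np.arange(num_levels)
    dist = np.abs(levels[None, None, :] - hard[..., None])
    scores = np.exp(-sharpness * dist)
    return scores / scores.sum(axis=-1, keepdims=True)

def weighted_soft_cross_entropy(logits, targets, pixel_weights):
    """Cross-entropy against soft targets, averaged with per-pixel weights."""
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    per_pixel = -(targets * log_probs).sum(axis=-1)
    return (pixel_weights * per_pixel).sum() / pixel_weights.sum()

# One pixel annotated as level 0, with uniform pixel weight.
target = soft_labels(np.array([[0]]))
weights = np.ones((1, 1))

near_miss = np.array([[[0.0, 5.0, 0.0]]])  # network predicts level 1
severe = np.array([[[0.0, 0.0, 5.0]]])     # network predicts level 2

loss_near = weighted_soft_cross_entropy(near_miss, target, weights)
loss_severe = weighted_soft_cross_entropy(severe, target, weights)
```

With these assumed soft targets, `loss_severe` is larger than `loss_near`, so confusing the two extreme levels costs more than confusing adjacent ones. Raising `pixel_weights` in a region of interest increases that region's share of the averaged loss.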

Keywords

computer vision ; robotics ; perception ; multi-modal ; semantic segmentation ; deep learning ; convolutional neural network ; driveability ; navigation ; infrared ; fusion ; thermal ; depth ; affordance ; classification ; scene understanding

A master's thesis from Aalborg University
