3D Volumetric and Semantic Reconstruction of a Robotic Workspace Using Deep Learning: A Proposal for a Symbiotic Human Robot Collaboration System
Translated title
3D Volumetric and Semantic Reconstruction of a Robotic Workspace Using Deep Learning
Author
Mateus Martins, Guilherme
Term
4th semester
Education
Publication year
2021
Submitted on
2021-06-02
Pages
48
Abstract
As robots work more closely with people, they must understand not only shapes and distances but also what things are and how they relate. This thesis presents a proof-of-concept system that reconstructs scenes and adds semantics (meaning and labels) to the robot’s map. The system operates in two stages: an offline stage that maps static objects in the environment, and an online stage that searches for small, dynamic objects by using larger, previously mapped objects as landmarks. Semantics are computed with a two-stage image pipeline: YOLOv4 detects and names objects in images, and DeepLabV3+ segments them by drawing precise outlines. To link static and dynamic objects, the system uses object ontologies, a structured description of object types and their relationships. After both offline and online reconstructions, the system projects object masks into 3D coordinates to place each object in a spatial map. Qualitative and quantitative results indicate the system is robust. As a proof of concept, it shows that static objects can serve as regions of interest to detect dynamic objects based on ontological relations and to infer where specific tasks are likely to occur.
[This abstract has been rewritten with the help of AI based on the project's original abstract]
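The two-stage pipeline described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the thesis code: `detect_objects` stands in for YOLOv4, `segment_object` for DeepLabV3+, and the `ONTOLOGY` table and all class names are hypothetical placeholders for the object ontologies the system uses.

```python
# Sketch of the abstract's semantic pipeline: detect static landmarks,
# segment them, and use an ontology to predict which dynamic objects
# to search for inside each landmark's region of interest.
from dataclasses import dataclass

@dataclass
class Detection:
    label: str   # class name from the detector (e.g. "table")
    box: tuple   # (x, y, w, h) bounding box in image pixels

# Toy ontology linking static landmark objects to the dynamic
# objects expected near them (hypothetical entries).
ONTOLOGY = {
    "table": ["cup", "screwdriver"],
    "workbench": ["wrench"],
}

def detect_objects(image):
    """Stage 1 stand-in for YOLOv4: return labeled bounding boxes.

    A real system would run a trained detector on `image`; here we
    return a fixed detection so the sketch is runnable."""
    return [Detection("table", (10, 10, 200, 120))]

def segment_object(image, det):
    """Stage 2 stand-in for DeepLabV3+: return a pixel mask.

    A real segmenter outlines the object precisely; this stub just
    fills the bounding box."""
    x, y, w, h = det.box
    return {(i, j) for i in range(x, x + w) for j in range(y, y + h)}

def regions_of_interest(image):
    """Treat static detections as landmarks and pair each mask with
    the dynamic objects the ontology predicts inside it."""
    rois = []
    for det in detect_objects(image):
        mask = segment_object(image, det)
        expected = ONTOLOGY.get(det.label, [])
        rois.append((det.label, mask, expected))
    return rois
```

In the full system each mask would additionally be projected into 3D coordinates to place the object in the spatial map, a step omitted here for brevity.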
Keywords
Deep learning ; AI ; YOLO ; DeepLab ; 3D Reconstruction ; Object ontologies ; GUI ; Human Robot Collaboration ; HRC ; Human Robot Interaction ; HRI ; Symbiotic Human Robot Collaboration ; ROS ; Python ; JSON ; Pytorch ; Darknet
