Qualitative geometric scene understanding using planar surfaces in multiple views

Author

Buhl, Jacob

Term

4. term

Education

Vision, Graphics and Interactive Systems, Master

Publication year

2011

Submitted on

2011-08-17

Pages

146

Abstract

This thesis explores how to automatically build a qualitative geometric understanding of a scene from multiple views, focusing on planar surfaces. Such understanding is useful in applications such as scene analysis and autonomous robots. Instead of reconstructing a precise 3D model, the system infers relationships between surfaces across frames (e.g., which surfaces are in front of others and how they are oriented). The system consists of four modules, each evaluated quantitatively. Pre-processing detects blur and noise in each frame and discards poor frames. Feature extraction finds corresponding keypoints between two frames using SIFT (Scale-Invariant Feature Transform) and KLT (Kanade–Lucas–Tomasi), and refines the matches using epipolar geometry (the geometry relating two camera views). Layer extraction identifies planar surfaces in two views with an extended RANSAC (a robust model-fitting method) and improves the initial layer segmentation using graph cut. Geometric scene understanding then estimates layer connections, relative depths, and orientations, and combines these features with simple reasoning rules to produce a qualitative geometric description. The complete system is evaluated qualitatively on two videos. The results indicate that accurate layer segmentation is crucial for performance; with reliable segmentation, the system yields a good and precise geometric scene understanding. The thesis contributes an extension to RANSAC and a simple, intuitive reasoning algorithm that uses homographies (mappings between views of a plane) and layer segmentations.

[This abstract has been rewritten with the help of AI based on the project's original abstract]
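
As an illustration of the layer-extraction idea in the abstract, the following is a minimal sketch of plain RANSAC fitting a single homography to keypoint matches. It is not the extended RANSAC contributed by the thesis, whose details are not given here. The function names, the `h33 = 1` normalisation, the one-pixel tolerance, and the iteration count are all assumptions made for this example.

```python
import random

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting.
    Returns None when the system is (near-)singular."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def homography_from_4(pairs):
    """Fit a 3x3 homography (assuming h33 = 1) from four matches ((x, y), (u, v))."""
    A, b = [], []
    for (x, y), (u, v) in pairs:
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(A, b)
    if h is None:
        return None  # degenerate sample, e.g. collinear points
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]

def apply_h(H, p):
    """Map point p through H; returns None if p maps to infinity."""
    x, y = p
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        return None
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)

def ransac_homography(matches, iters=500, tol=1.0, seed=0):
    """Return (H, inlier_indices) for the homography with the most inliers."""
    rng = random.Random(seed)
    best_H, best_inliers = None, []
    for _ in range(iters):
        H = homography_from_4(rng.sample(matches, 4))
        if H is None:
            continue
        inliers = []
        for i, (p, q) in enumerate(matches):
            m = apply_h(H, p)
            # Inlier test: squared transfer error within tol pixels.
            if m is not None and (m[0] - q[0]) ** 2 + (m[1] - q[1]) ** 2 <= tol ** 2:
                inliers.append(i)
        if len(inliers) > len(best_inliers):
            best_H, best_inliers = H, inliers
    return best_H, best_inliers
```

Calling `ransac_homography(matches)` on a list of `((x, y), (u, v))` pairs returns one homography and the indices of the matches consistent with it. One common way to find several planar layers is to remove those inliers and run the fit again on the remaining matches; whether the thesis proceeds this way is not stated in the abstract.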

Keywords

RANSAC ; graph cut ; layer ; planar surface ; scene understanding ; multiple view ; reconstruction ; noise ; blur ; block world ; object orientation

A master's thesis from Aalborg University
