A master's thesis from Aalborg University


R-FCN Object Detection Ensemble based on Object Resolution and Image Quality

Author

Term

4th term

Publication year

2017

Abstract

Teaching computers to detect objects in images is challenging because objects vary widely within a class (e.g., size, shape, pose) and between classes, and because images themselves differ in quality and lighting. This study explores improving object detection by combining multiple Region-based Fully Convolutional Networks (R-FCN), a type of deep learning model for object detection. Two approaches were tested: (1) data sampling and selection to create training subsets with reduced variation in object size and image quality, so that a specialized “expert” model could be trained on each subset; and (2) strategies for merging these experts’ detections into a single ensemble output. The models were trained and evaluated on the PASCAL VOC benchmark dataset. When the expert models were combined appropriately, Average Precision (AP), a standard accuracy measure, increased. The method shows promise, and future work could target other object or image variations to build an even more robust ensemble.

[This abstract was generated with the help of AI]
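
The abstract does not say which merging strategy the thesis uses. As an illustration only, the sketch below shows one common way to merge detections from several expert models for a single class: pool all boxes and apply greedy non-maximum suppression (NMS). The detection format, the function names `iou` and `merge_expert_detections`, the threshold of 0.5, and the sample boxes are all assumptions made for this example, not details taken from the thesis.

```python
from typing import List, Tuple

# Hypothetical detection format for one class: (x1, y1, x2, y2, score).
Detection = Tuple[float, float, float, float, float]


def iou(a: Detection, b: Detection) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_expert_detections(
    per_expert: List[List[Detection]], iou_threshold: float = 0.5
) -> List[Detection]:
    """Pool detections from all experts and apply greedy NMS.

    This is an assumed merging strategy, chosen for illustration.
    """
    # Visit boxes from highest to lowest confidence.
    pooled = sorted(
        (det for dets in per_expert for det in dets),
        key=lambda det: det[4],
        reverse=True,
    )
    kept: List[Detection] = []
    for det in pooled:
        # Keep a box only if it does not heavily overlap a stronger kept box.
        if all(iou(det, k) <= iou_threshold for k in kept):
            kept.append(det)
    return kept


# Made-up outputs from two hypothetical experts (small-object, large-object).
small_expert = [(10.0, 10.0, 50.0, 50.0, 0.9)]
large_expert = [(12.0, 12.0, 52.0, 52.0, 0.6), (100.0, 100.0, 200.0, 200.0, 0.8)]
merged = merge_expert_detections([small_expert, large_expert])
```

In this toy input the two experts report nearly the same box around one object, so only the higher-scoring copy survives, while the non-overlapping box from the second expert is kept. Pooled NMS treats all experts equally; a scheme that weights each expert by object size or image quality would need additional logic.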