A master's thesis from Aalborg University


Copy and paste synthetic dataset generation in agriculture: An investigation in reducing the need for manual annotation using generated datasets.

Translated title

Copy and paste synthetic dataset generation in agriculture

Term

4th term

Publication year

2021

Pages

55

Abstract

This thesis investigates whether relatively simple methods for generating synthetic datasets for image-based weed detection can perform as well as training on traditional, manually annotated data. It reviews several candidate datasets, analyzes the structure of the selected dataset, and uses that analysis to create multiple synthetic datasets with the Cut, Paste and Learn approach (objects are cut out and pasted to form new training examples). The choice of segmentation model for testing is discussed briefly, followed by the design and implementation of both the data generation pipeline and the segmentation model. In the experiments, different blending techniques were applied when compositing the synthetic images. Surprisingly, models trained on fully synthetic datasets outperformed those trained on the conventionally annotated dataset.
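The cut-and-paste idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names `box_blur` and `paste_object`, the box-blur blending, the `class_id` label convention, and the assumption that the pasted object fits entirely inside the background are all hypothetical choices made for illustration. The sketch pastes a masked object onto a background image, optionally softens the mask edge as one possible blending technique, and writes the matching segmentation label so that no manual annotation is needed.

```python
import numpy as np


def box_blur(alpha, radius):
    """Soften a 2-D mask with a (2*radius+1)^2 mean filter, edge-padded."""
    alpha = alpha.astype(np.float64)
    if radius == 0:
        return alpha
    k = 2 * radius + 1
    padded = np.pad(alpha, radius, mode="edge")
    out = np.zeros_like(alpha)
    # Sum all shifted copies of the padded mask, then average.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + alpha.shape[0], dx:dx + alpha.shape[1]]
    return out / (k * k)


def paste_object(background, label, obj, obj_mask, top, left, class_id,
                 blur_radius=0):
    """Paste `obj` onto `background` at (top, left) and update `label`.

    Assumes (hypothetically) that the object fits inside the background.
    blur_radius=0 is a hard paste; larger values blend the object edge.
    """
    h, w = obj_mask.shape
    image = background.astype(np.float64)  # astype returns a copy
    new_label = label.copy()
    alpha = box_blur(obj_mask, blur_radius)[..., None]  # shape (h, w, 1)
    region = image[top:top + h, left:left + w]
    image[top:top + h, left:left + w] = alpha * obj + (1.0 - alpha) * region
    # The label uses the hard mask, so annotations stay crisp.
    new_label[top:top + h, left:left + w][obj_mask > 0] = class_id
    return np.rint(image).astype(np.uint8), new_label


# Stand-in data: a grey "soil" background and a green square "plant".
background = np.full((64, 64, 3), 120, dtype=np.uint8)
label = np.zeros((64, 64), dtype=np.uint8)
plant = np.zeros((16, 16, 3), dtype=np.uint8)
plant[..., 1] = 200
plant_mask = np.zeros((16, 16), dtype=bool)
plant_mask[4:12, 4:12] = True

hard_image, hard_label = paste_object(
    background, label, plant, plant_mask, top=10, left=20, class_id=1)
soft_image, soft_label = paste_object(
    background, label, plant, plant_mask, top=10, left=20, class_id=1,
    blur_radius=2)
```

Calling `paste_object` repeatedly with different objects, positions, and blur radii would yield a set of image and label pairs; comparing models trained with different `blur_radius` values is one simple way to mimic an experiment over blending techniques.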

[This summary has been rewritten with the help of AI based on the project's original abstract]