• Ernest Bofill Ylla
4th semester, Medialogy, Master (Master's programme)
Convolutional neural networks (CNNs) are the state of the art in computer vision tasks thanks to recent architectural breakthroughs such as the "Inception" architecture. Large datasets of annotated images, which are necessary to train CNNs, are scarce. 3D models can be used to generate synthetic datasets of rendered images in a fast and automated way.

This thesis investigates whether amplifying a small dataset of natural images with a much larger dataset of rendered images improves the classification accuracy of an Inception-V3 CNN re-trained with transfer learning. Two image datasets of Lego bricks are generated for the experiment: a large synthetic dataset generated from a Lego 3D model and a small dataset of photos.
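
As a rough illustration of what amplifying a small natural dataset with a larger synthetic one can look like in practice, the following minimal Python sketch merges two lists of labelled image paths into one training set. The function name, file paths, labels, and the choice to hold out only natural photos for testing are all assumptions made for this example; the abstract does not describe the actual split or tooling used in the thesis.

```python
import random


def build_amplified_split(natural, synthetic, test_fraction=0.2, seed=0):
    """Return (train, test) lists of (path, label, source) tuples.

    Assumption: the test set contains natural photos only, so accuracy
    is always measured on real images. This is an illustrative choice,
    not the split documented in the thesis.
    """
    rng = random.Random(seed)
    shuffled = list(natural)
    rng.shuffle(shuffled)

    # Hold out a fraction of the natural photos for evaluation.
    n_test = max(1, int(len(shuffled) * test_fraction))
    test = [(path, label, "natural") for path, label in shuffled[:n_test]]

    # Train on the remaining natural photos plus all rendered images.
    train = [(path, label, "natural") for path, label in shuffled[n_test:]]
    train += [(path, label, "synthetic") for path, label in synthetic]
    rng.shuffle(train)
    return train, test


# Hypothetical file names and binary labels, for illustration only.
natural = [(f"photos/brick_{i}.jpg", i % 2) for i in range(10)]
synthetic = [(f"renders/brick_{i}.png", i % 2) for i in range(100)]

train, test = build_amplified_split(natural, synthetic)
print(len(train), len(test))
```

Keeping the source tag on every sample makes it easy to train once with and once without the synthetic portion, which is the kind of comparison the results paragraph reports.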

Results show that the amplified dataset yields a lower classification accuracy: 82% without augmentation versus 68% with it. This observation cannot be generalised, because differences found between the natural and synthetic data might have affected recognisability. Despite that, synthetic datasets still have a lot of potential in situations where image datasets are difficult to obtain. Further research should investigate how improvements in the rendering process influence image recognition.
Publication date: 26 Jan. 2017
Number of pages: 51
ID: 250298375