Deep Learning Approaches to Art Style Recognition in Digital Images
Student thesis: Master's thesis and HD graduation project
- Rasmus Hove Johnsen
- Andrea Gradecak
4th semester, Software, Master (Master's programme)
Convolutional Neural Networks (CNNs) have become state-of-the-art image recognition models, but have not been used to significant effect for art style recognition in fine art paintings.
Through incremental experimentation with a number of aspects of CNNs, we have built a model for this task. The model has 7 blocks, each containing one convolutional layer with rectified linear units and a max-pooling layer. It has 32 feature maps in the first convolutional layer, and this number is doubled after each max-pooling layer. At the end of the network there is one fully connected layer with 8 neurons and a softmax output layer. As a baseline we have used the VGG16 network with a Support Vector Machine (SVM) as classifier. To compensate for the relatively small size of the dataset, we employed a sliding window cropping technique, taking a maximum of 10 crops from each original image to inflate the dataset.
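The layer arithmetic described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the 256x256 input size, the 2x2 max-pooling with stride 2, and convolutions that preserve spatial size are not given in the abstract, and `block_shapes` is a hypothetical helper name.

```python
def block_shapes(input_size=256, first_maps=32, num_blocks=7):
    """Return (feature_maps, spatial_size) after each conv + max-pool block.

    Assumptions (not stated in the abstract): square input of `input_size`
    pixels, convolutions that keep the spatial size, and 2x2 max-pooling
    with stride 2 that halves it.
    """
    shapes = []
    maps, size = first_maps, input_size
    for _ in range(num_blocks):
        size //= 2                 # assumed 2x2 max-pooling halves each side
        shapes.append((maps, size))
        maps *= 2                  # feature maps double after each pooling layer
    return shapes


shapes = block_shapes()
feature_maps = [m for m, _ in shapes]

# Number of values fed into the fully connected layer with 8 neurons.
final_maps, final_size = shapes[-1]
flattened = final_maps * final_size * final_size
```

Under these assumptions the seven blocks use 32, 64, 128, 256, 512, 1024 and 2048 feature maps, so most of the network's width sits in the last few blocks.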
Testing on three pairs of styles, we reached our highest test accuracy of 95.3% on Color Field Painting and Magic Realism. However, this was not enough to beat the baseline, which reached a test accuracy of 97.8%.
To conclude, we find that, without aggressive augmentation, training purely on fine art paintings for style recognition is not a viable improvement over using a pretrained CNN and an SVM classifier.
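The sliding window cropping mentioned in the abstract could look roughly like the following sketch. Only the cap of 10 crops per image comes from the abstract. The crop size, the stride, the row-by-row scan order, and the name `sliding_window_boxes` are all assumptions made for illustration.

```python
def sliding_window_boxes(width, height, crop=224, stride=112, max_crops=10):
    """Return up to `max_crops` crop boxes as (left, top, right, bottom).

    Hypothetical parameters: `crop` and `stride` are illustrative values;
    only the limit of 10 crops per image is taken from the abstract.
    """
    boxes = []
    # Scan the image row by row, moving the window by `stride` pixels.
    for top in range(0, height - crop + 1, stride):
        for left in range(0, width - crop + 1, stride):
            boxes.append((left, top, left + crop, top + crop))
            if len(boxes) == max_crops:
                return boxes       # stop once the per-image cap is reached
    return boxes


boxes = sliding_window_boxes(800, 600)
```

The boxes use the (left, top, right, bottom) order, so they could be passed to an image library's crop routine. An image smaller than the window yields no crops in this sketch, and a real pipeline would need to decide how to handle that case.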
Language | English |
---|---|
Publication date | 7 Jun 2017 |
Number of pages | 81 |