• Amalie Vistoft Petersen
• Jacob Theilgaard Lassen
• Sebastian Biegel Schiøler
This project investigates the use of neural networks to detect adversarial examples in speech recognition, using five different feature extraction methods as input: STFT, MFCC, IMFCC, GFCC and IGFCC. Relevant theory in deep learning, adversarial examples and speech processing is examined, and the available white-box and black-box datasets are described. A CNN model is implemented and evaluated with respect to the performance and robustness of the different feature extraction methods. This includes an investigation of how performance is affected when the data contains only speech or only non-speech. Different types and amounts of noise are also added to the data to determine how noise affects the model's performance. It is concluded that the CNN model is able to detect adversarial examples for speech recognition, and that the IMFCC and IGFCC feature extraction methods generally achieve the highest accuracies. Furthermore, the model is generally more robust to noise when its training set contains a wider range of noise types.
Publication date: 6 Jun. 2019
ID: 305178683