Listening Beyond Words: Transfer Learning for Audio Deepfake Detection

Student thesis: Master's thesis and HD graduation project

  • Gustav Arnt Palmelund Bonvang
4th semester, Cyber Security, Master's (Master's programme)
This project investigates audio deepfake detection. A review of existing research finds that residual neural networks and transfer learning both show potential for the task. On that basis, a novel approach is proposed that combines the ResNet50 network architecture with transfer learning. Models are trained and evaluated on the In-The-Wild dataset, which is converted to mel spectrograms and re-scaled before being fed to the network. Several models are trained with different hyperparameters, including a range of baseline models that do not use transfer learning. The best model used transfer learning and achieved an accuracy of 96.7% and an F1-score of 95.5%. Compared with the baseline models trained without transfer learning, the use of transfer learning yielded an average increase of 21.90% in accuracy and 44.01% in F1-score.
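The abstract reports results as accuracy and F1-score. As a reminder of how the two metrics relate on a binary real/fake task, here is a minimal sketch in plain Python. The function name `binary_metrics`, the label convention (1 = deepfake, 0 = genuine), and the toy labels are assumptions made for illustration; none of them come from the thesis, and the numbers are not its results.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels.

    Assumption: `positive` marks the deepfake class (hypothetical convention).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels, invented for illustration only (1 = deepfake, 0 = genuine).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Accuracy counts every correct prediction equally, while F1 depends only on how the positive class is handled, so the two can diverge on an imbalanced dataset. That may be one reason the abstract reports both.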
Publication date: 1 Jun 2023
Number of pages: 51
ID: 532592890