Listening Beyond Words: Transfer Learning for Audio Deepfake Detection
Student thesis: Master Thesis and HD Thesis
- Gustav Arnt Palmelund Bonvang
4th semester, Master of Science (MSc) in Cyber Security (Master Programme)
This project investigates audio deepfake detection. A review of existing research finds that residual neural networks and transfer learning both show potential for the task. Based on this, a novel approach is proposed that combines the ResNet50 network architecture with transfer learning. Models are trained and evaluated on the In-The-Wild dataset, which is converted to a set of mel spectrograms and rescaled before being fed to the neural network. Several models are trained with different hyperparameters, including a range of baseline models that do not use transfer learning. The best model used transfer learning and achieved an accuracy of 96.7% and an F1-score of 95.5%. Compared with the non-transfer-learning baseline models, transfer learning yielded an average increase of 21.90% in accuracy and 44.01% in F1-score.
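The abstract reports results as accuracy and F1-score. As a reminder of how these two metrics relate to the confusion matrix of a binary detector, here is a minimal sketch in plain Python. It is not the thesis's evaluation code: the function names, the choice of label 1 (deepfake) as the positive class, and the toy labels are all assumptions made for illustration, and the numbers it produces are unrelated to the reported results.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn), treating label 1 as the positive class.

    Assumption: 1 = deepfake, 0 = genuine. The abstract does not state
    which class the thesis treats as positive.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of all clips that were classified correctly."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy labels, not data from the thesis.
toy_true = [1, 1, 1, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 0, 0, 0, 1]

print(confusion_counts(toy_true, toy_pred))
print(accuracy(toy_true, toy_pred))
print(f1_score(toy_true, toy_pred))
```

Because F1 ignores true negatives, it can sit well below accuracy when the positive class is the harder or rarer one, which is one reason the two metrics are usually reported together.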
Language | English |
---|---|
Publication date | 1 Jun 2023 |
Number of pages | 51 |