Listening Beyond Words: Transfer Learning for Audio Deepfake Detection
Author
Bonvang, Gustav Arnt Palmelund
Term
4th semester
Education
Publication year
2023
Submitted on
2023-06-01
Pages
51
Abstract
Audio deepfakes—synthetic or manipulated speech that imitates a real voice—are becoming increasingly convincing, creating a need to detect them. This project investigates how to identify such forgeries in audio. A review of prior work indicates that residual neural networks (ResNets) and transfer learning (adapting a pretrained model to a new task) are promising for audio deepfake detection. Based on this, we propose an approach using the ResNet50 architecture with transfer learning. Models are trained and evaluated on the In-The-Wild dataset. Each audio clip is converted into a mel spectrogram—a picture-like representation of sound frequencies over time—and rescaled before being fed into the network. We train multiple models with different hyperparameters, including baseline models trained without transfer learning. The best model, which uses transfer learning, achieves 96.7% accuracy and a 95.5% F1-score (a single metric that balances precision and recall). Compared with the non-transfer baselines, transfer learning yields an average increase of 21.90% in accuracy and 44.01% in F1-score. These results suggest that combining ResNet50 with transfer learning is an effective approach to detecting audio deepfakes in this setting.
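The abstract reports both accuracy and F1-score, where F1 is the harmonic mean of precision and recall. As a minimal sketch of how these two metrics differ, the snippet below computes them from confusion-matrix counts; the counts used are invented for illustration and are not the project's results.

```python
# Illustrative only: accuracy and F1 computed from hypothetical
# confusion-matrix counts (not the counts from this project).

def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall (ignores true negatives)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts for a fake-vs-real classifier ("fake" = positive class)
tp, tn, fp, fn = 90, 85, 10, 15
print(round(accuracy(tp, tn, fp, fn), 3))  # 0.875
print(round(f1_score(tp, fp, fn), 3))      # 0.878
```

Because F1 ignores true negatives, it can diverge noticeably from accuracy on imbalanced data, which is why the abstract reports a larger transfer-learning gain in F1 (44.01%) than in accuracy (21.90%).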
[This abstract has been rewritten with the help of AI based on the project's original abstract]
Keywords
