Author(s)
Term
4. Term
Education
Publication year
2018
Submitted on
2018-05-28
Pages
105 pages
Abstract
The purpose of this project was to implement a non-negative matrix factorisation (NMF) method and evaluate its ability to extract speech from a signal mixed with non-stationary noise, here wind. The method was compared with the state of the art (non-negative sparse coding, NNSC), the unprocessed signals, and two noise reduction methods designed for stationary noise. The methods were tested and evaluated under different conditions: the number of wind and speech components, the signal-to-noise ratio (SNR), and two different β-divergence values. Two dictionaries were trained, one for speech and one for wind. The measures used for the evaluation were PESQ and STOI; in addition, SNRout was measured for the NMF and NNSC methods. The results indicate that the NMF failed to extract the speech and wind from the mixed signals: overall it scored lower than the unprocessed signals and the two stationary noise reduction methods, while most of the time it performed similarly to the NNSC method. NNSC had previously been found to give good results, which could indicate that the number of signals used to train the speech and wind dictionaries was too low for the NMF and NNSC to extract untrained speech and wind signals. At the same time, a lot of distortion was noticed in the audio signals, which could indicate that the dictionaries extracted parts of the wrong source.
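To make the approach in the abstract concrete, the following is a minimal sketch of dictionary-based NMF separation with the standard multiplicative updates for the β-divergence. It is not the implementation from the project: the function names, the iteration count, the random initialisation and the soft-mask reconstruction are all assumptions chosen for illustration, and the inputs are assumed to be non-negative magnitude (or power) spectrograms of shape frequency × time.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero and log(0)


def beta_divergence(V, V_hat, beta):
    """Sum of the element-wise beta-divergence between V and its model V_hat."""
    V, V_hat = V + EPS, V_hat + EPS
    if beta == 1:  # generalised Kullback-Leibler divergence
        return np.sum(V * np.log(V / V_hat) - V + V_hat)
    if beta == 0:  # Itakura-Saito divergence
        return np.sum(V / V_hat - np.log(V / V_hat) - 1)
    return np.sum((V**beta + (beta - 1) * V_hat**beta
                   - beta * V * V_hat**(beta - 1)) / (beta * (beta - 1)))


def update_H(V, W, H, beta):
    """One multiplicative update of the activations H with W held fixed."""
    V_hat = W @ H + EPS
    return H * (W.T @ (V_hat**(beta - 2) * V)) / (W.T @ V_hat**(beta - 1) + EPS)


def update_W(V, W, H, beta):
    """One multiplicative update of the dictionary W with H held fixed."""
    V_hat = W @ H + EPS
    return W * ((V_hat**(beta - 2) * V) @ H.T) / (V_hat**(beta - 1) @ H.T + EPS)


def train_dictionary(V, n_components, beta=1.0, n_iter=200, seed=0):
    """Learn a dictionary W (and activations H) from single-source training data."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], n_components)) + EPS
    H = rng.random((n_components, V.shape[1])) + EPS
    for _ in range(n_iter):
        H = update_H(V, W, H, beta)
        W = update_W(V, W, H, beta)
    return W, H


def separate(V, W_speech, W_wind, beta=1.0, n_iter=200, seed=0):
    """Split a mixture V using fixed, pre-trained speech and wind dictionaries."""
    W = np.hstack([W_speech, W_wind])  # concatenated dictionary, kept fixed
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        H = update_H(V, W, H, beta)
    k = W_speech.shape[1]
    speech_model = W_speech @ H[:k]
    wind_model = W_wind @ H[k:]
    # Soft mask (an assumption): share the mixture between the two models.
    mask = speech_model / (speech_model + wind_model + EPS)
    return mask * V, (1 - mask) * V
```

In this sketch the number of columns in each dictionary plays the role of the number of speech and wind components, and `beta` selects the divergence (for example 1 for generalised Kullback-Leibler, 2 for squared Euclidean distance). Because the mask only redistributes the mixture, any activation assigned to the wrong dictionary shows up directly as distortion or residual noise in the estimates.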
Colophon: This page is part of the AAU Student Projects portal, which is run by Aalborg University. Here, you can find and download publicly available bachelor's theses and master's projects from across the university dating from 2008 onwards. Student projects from before 2008 are available in printed form at Aalborg University Library.