• Benjamin Elif Larsen
4. Term, Sound and Music Computing (Master Programme)
The purpose of this project was to implement a non-negative matrix factorisation (NMF) method and evaluate its ability to extract speech from a mixed signal containing non-stationary noise, in this case wind. The method was compared to the state of the art (non-negative sparse coding, NNSC), to the unprocessed signals, and to two noise reduction methods designed for stationary noise. The methods were tested and evaluated under different conditions: the number of wind and speech components, the signal-to-noise ratio (SNR), and two different β-divergence values.

Two dictionaries were trained: a speech dictionary and a wind dictionary.
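The dictionary-based separation described above can be illustrated with a minimal sketch. It is not the implementation used in the project: it assumes non-negative magnitude spectrograms as input, the standard multiplicative updates for the β-divergence, and a Wiener-like soft mask for reconstruction. The names `nmf_beta` and `separate`, the component counts, and the random toy data are all hypothetical.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def nmf_beta(V, n_components, beta=1.0, n_iter=200, W=None, seed=0):
    """Approximate V by W @ H using multiplicative beta-divergence updates.

    If a dictionary W is supplied it is kept fixed and only the
    activations H are estimated (n_components is then ignored).
    """
    rng = np.random.default_rng(seed)
    update_W = W is None
    if update_W:
        W = rng.random((V.shape[0], n_components)) + EPS
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        WH = W @ H + EPS
        H *= (W.T @ (V * WH ** (beta - 2))) / (W.T @ WH ** (beta - 1) + EPS)
        if update_W:
            WH = W @ H + EPS
            W *= ((V * WH ** (beta - 2)) @ H.T) / (WH ** (beta - 1) @ H.T + EPS)
    return W, H


def separate(V_mix, W_speech, W_wind, beta=1.0, n_iter=200):
    """Split a mixture using fixed, pre-trained speech and wind dictionaries."""
    W = np.hstack([W_speech, W_wind])
    _, H = nmf_beta(V_mix, W.shape[1], beta=beta, n_iter=n_iter, W=W)
    k = W_speech.shape[1]
    S = W_speech @ H[:k]          # speech part of the model
    N = W_wind @ H[k:]            # wind part of the model
    mask = S / (S + N + EPS)      # Wiener-like soft mask (an assumption)
    return mask * V_mix, (1.0 - mask) * V_mix


# Toy data standing in for magnitude spectrograms (frequency x time).
rng = np.random.default_rng(1)
V_speech = rng.random((64, 100))
V_wind = rng.random((64, 100))

W_s, _ = nmf_beta(V_speech, n_components=8)   # train speech dictionary
W_n, _ = nmf_beta(V_wind, n_components=4)     # train wind dictionary

V_mix = V_speech + V_wind
speech_est, wind_est = separate(V_mix, W_s, W_n)
```

In this sketch β = 1 corresponds to the Kullback-Leibler divergence and β = 2 to the squared Euclidean distance; the two β values actually tested in the project are not stated here.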

The measures used for the evaluation were PESQ and STOI. The output SNR (SNRout) was measured for the NMF and the state-of-the-art method.
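As an illustration of an output SNR measure, the sketch below computes the ratio between the energy of a clean reference signal and the energy of the residual between reference and estimate, in decibels. The exact definition of SNRout used in the project is not given here, so this common time-domain definition and the function name `snr_db` are assumptions.

```python
import numpy as np


def snr_db(reference, estimate):
    """SNR in dB, treating (reference - estimate) as the noise.

    Undefined (division by zero) when the estimate equals the reference.
    """
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    noise = reference - estimate
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(noise ** 2))


# An estimate that is 10 % too quiet leaves a residual of 0.1 * reference,
# giving an energy ratio of 100, i.e. about 20 dB.
clean = np.array([1.0, 0.0, -1.0, 0.0])
value = snr_db(clean, 0.9 * clean)
```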

The results indicate that the NMF failed to extract the speech and wind from the mixed signals: overall it scored lower than the unprocessed signals and the two stationary noise reduction methods, while most of the time it performed similarly to the NNSC method.
The NNSC had previously been found to give good results, which could indicate that the number of signals used to train the speech and wind dictionaries was too low for the NMF and NNSC to extract untrained speech and wind signals. At the same time, considerable distortion was noticed in the audio signals, which could indicate that the dictionaries extracted parts of the wrong source.
Publication date: 29 May 2018
Number of pages: 105
ID: 279994212