Spectral Speech Enhancement using Deep Neural Networks: - Design, Analysis & Evaluation -

Authors

Jacobsen, Anders Post ; Kolbæk, Morten

Term

4th term

Education

Signal Processing and Computing, Master

Publication year

2015

Submitted on

2015-06-02

Pages

160

Abstract

Speech enhancement is important in a wide range of applications, such as mobile phones, speech recognition and hearing aids. In various acoustic environments, especially at low signal-to-noise ratios (SNRs), the goal of speech enhancement methods is to solve the cocktail party problem. With regard to intelligibility, machine learning methods that aim to estimate an ideal binary mask have shown promising results. This master's thesis covers speech enhancement using deep neural networks (DNNs). In particular, a DNN-based system for estimating an ideal binary mask was implemented in MATLAB. Simulations show that the proposed DNN-based speech enhancement algorithm can enhance noisy speech as measured by an intelligibility predictor (STOI) and a quality predictor (PESQ). Furthermore, using a soft mask instead of a binary mask yields additional improvement in STOI and PESQ. The project was proposed and motivated by both Aalborg University and Oticon A/S.

Keywords

Speech enhancement ; Ideal binary mask ; Deep neural network ; Restricted Boltzmann machine ; Deep belief network ; Wiener filter ; MMSE estimator ; Speech intelligibility ; Speech quality ; STOI ; PESQ ; Soft mask

A master's thesis from Aalborg University
