A Hybrid Approach for Speech Enhancement with DNN Supported Acoustic Beamforming
Student thesis: Master Thesis and HD Thesis
- Poul Hoang
4. term, Signal Processing and Computing, Master (Master Programme)
Modern hearing aids often have more than one microphone per device. It has been shown that substantial gains in speech intelligibility can be obtained by applying multichannel signal processing methods (e.g. beamformers) to noisy observations in acoustic environments such as cocktail parties or restaurants. Model-based signal processing methods might, however, perform less well in acoustic environments where the signal-to-noise ratio (SNR) is low, as the unknown parameters needed for the beamformers are harder to estimate. The motivation behind the work presented in this thesis is thus to explore the possibility of applying a deep neural network (DNN) to support an acoustic beamformer as an alternative to the model-based methods. Specifically, the DNN in this thesis estimates the direction-of-arrival (DOA) and the relative transfer function (RTF) vector needed for the examined beamformers.
We propose three types of DNN-supported beamformers in this thesis: 1) a minimum power distortionless response (MPDR) beamformer supported by a DNN for DOA estimation, 2) an MPDR beamformer supported by a DNN estimating RTF vectors, and 3) a Bayesian beamformer where the posterior probabilities are estimated by a DNN. The experimental results show that the DNN-supported beamformers outperform a model-based Bayesian beamformer in acoustic scenes with isotropic babble noise in terms of ESTOI, PESQ, and segSNR scores.
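As a rough illustration of the beamformers named above, the following sketch computes textbook MPDR weights for one frequency bin from a noisy-signal covariance matrix and an RTF (or steering) vector, and combines several candidate beamformers with posterior probabilities in the manner of a Bayesian beamformer. It is not the thesis's implementation: the function names, the diagonal loading term, and the assumption that the posteriors are already available (for example from a DNN) are illustrative choices only.

```python
import numpy as np


def mpdr_weights(R, d, diag_load=1e-6):
    """Textbook MPDR weights w = R^{-1} d / (d^H R^{-1} d) for one frequency bin.

    R: (M, M) Hermitian covariance matrix of the noisy microphone signals.
    d: (M,) RTF or steering vector of the target source.
    diag_load: hypothetical regularisation factor, not taken from the thesis.
    """
    M = R.shape[0]
    # Scale the loading by the average channel power so it is unit independent.
    R_loaded = R + diag_load * (np.trace(R).real / M) * np.eye(M)
    Rinv_d = np.linalg.solve(R_loaded, d)
    return Rinv_d / (d.conj() @ Rinv_d)


def bayesian_weights(R, candidates, posteriors):
    """Posterior-weighted sum of MPDR beamformers over candidate vectors.

    candidates: sequence of K vectors of shape (M,), one per candidate direction.
    posteriors: (K,) probabilities summing to one, assumed to be given.
    """
    W = np.stack([mpdr_weights(R, d) for d in candidates])  # shape (K, M)
    return np.asarray(posteriors) @ W
```

For a noisy observation vector `x` in the same frequency bin, the enhanced output would be `np.vdot(w, x)`, i.e. w^H x. The distortionless constraint means w^H d = 1 for the vector `d` used to build the weights.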
Language | English |
---|---|
Publication date | 7 Jun 2018 |
Number of pages | 80 |