• Michail Kampitakis
4. term, Signal Processing and Computing, Master (Master Programme)
Speech enhancement systems aim to improve the quality and intelligibility of noisy speech signals. Conventional loss functions such as the mean squared error (MSE) have been shown not to correlate highly with how humans perceive speech, and systems trained with them do not perform well in subjective listening tests. By contrast, objective metrics used as loss functions yield better performance on objective tests. However, the results of subjective and objective intelligibility tests do not always correlate. In this thesis, an Automatic Speech Recognition (ASR) model is employed as a loss function in the speech enhancement system, with the aim of closing the gap in intelligibility performance between subjective and objective tests. The hypothesis is that minimizing the Word Error Rate (WER) of a noisy speech signal will improve intelligibility in both kinds of test.
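For readers unfamiliar with the Word Error Rate named in the hypothesis, the following is a minimal sketch of the standard metric: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and an ASR transcript, divided by the number of reference words. The function name `word_error_rate` and the example sentences are illustrative assumptions, not taken from the thesis. The sketch shows only the metric; how the thesis turns an ASR model into a trainable loss is not described in this abstract.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative helper (hypothetical name); whitespace tokenization is an
    assumption, and real evaluations usually normalize text first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts, for illustration only.
    print(word_error_rate("the cat sat on the mat", "the cat on mat"))
```

A lower value means the recognizer's transcript is closer to the reference, which is why the abstract treats reducing WER as a proxy for improved intelligibility.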
Language: English
Publication date: 2023
Number of pages: 57
ID: 532654095