• Carina Enevold Andersen
• Dennis Alexander Lehmann Thomsen
4th semester, Signal Processing and Computing (Signalbehandling og Beregning), Master's programme
This project investigates whether the performance of noise reduction algorithms in speech recognition is related to their performance in speech enhancement.
General theory related to speech production and hearing is presented together with the basics of Mel-frequency cepstral coefficient (MFCC) speech features.
The fundamental theory of hidden Markov model speech recognition is stated along with the standard feature-extraction method, the European Telecommunications Standards Institute (ETSI) advanced front-end (AFE).
The performance of the ETSI AFE algorithm and of state-of-the-art speech enhancement algorithms is investigated in both fields using speech data from the Aurora-2 database.
The aggressiveness of the noise reduction applied has been identified as a major difference between the algorithms from the two fields, and has been adjusted to increase performance in the rivalling field.
Using a logistic model, estimators of recognition performance are created for the ETSI AFE using the distortion measures for speech quality and intelligibility.
The most accurate estimator of the recognition performance of the ETSI AFE proved to be the one based on the short-time objective intelligibility (STOI) measure, using a recogniser trained on both clean and noisy speech data.
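As an illustration of how a logistic model can map a distortion-measure score to a predicted recognition accuracy, the following is a minimal sketch only. The parametrisation, the least-squares fit on logit-transformed accuracies, the function names `logistic` and `fit_logistic`, and the data points are all assumptions made for illustration; they are not taken from the thesis and do not reproduce its estimators or results.

```python
import math

def logistic(d, a, b):
    """Map a distortion-measure score d to a predicted accuracy in percent."""
    return 100.0 / (1.0 + math.exp(-(a * d + b)))

def fit_logistic(scores, accuracies):
    """Fit (a, b) by ordinary least squares on logit-transformed accuracies.

    Since logit(acc / 100) = a * d + b is linear in d, a straight-line fit
    in the logit domain gives the two parameters (an assumed, simple method).
    """
    eps = 1e-6  # keep accuracies strictly inside (0, 1) before the logit
    ys = []
    for acc in accuracies:
        p = min(max(acc / 100.0, eps), 1.0 - eps)
        ys.append(math.log(p / (1.0 - p)))
    n = len(scores)
    mean_x = sum(scores) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in scores)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(scores, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

# Synthetic points generated from a known curve (hypothetical, not thesis data).
true_a, true_b = 8.0, -4.0
scores = [0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
accuracies = [logistic(d, true_a, true_b) for d in scores]

a, b = fit_logistic(scores, accuracies)
predicted = logistic(0.5, a, b)  # predicted accuracy for a new score
```

With measured data, `scores` would hold the objective measure per test condition and `accuracies` the corresponding recogniser accuracy; a nonlinear least-squares fit in the accuracy domain is an alternative when accuracies come close to 0 % or 100 %.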



Language: English
Publication date: 3 Jun. 2015
Number of pages: 144
External collaboration partner: Oticon
Jesper Jensen jsj@oticon.dk
Other
ID: 213552058