Artificial Bandwidth Extension of Narrowband Speech

Student thesis: Master's thesis (including HD graduation project)

  • Nels Rohde
  • Svend Aage Vedstesen
This thesis addresses the challenges of estimating wideband speech (0-8000 Hz) from narrowband speech (0-3400 Hz). The missing upper spectral components are estimated from the narrowband speech using statistical approaches. Using the source-filter model, the estimation problem is divided into estimating a wideband envelope and a wideband excitation signal. These two estimates are then combined to obtain an artificially extended wideband speech signal. Three methods, based on Vector Quantization, Gaussian Mixture Models and Hidden Markov Models respectively, have been developed for estimating the wideband envelope. Results show that the latter two outperform the method based on Vector Quantization, in both objective measures and audible quality. The excitation is estimated by simple spectral replication. A new perceptual training procedure, which uses Mel Frequency Cepstral Coefficients for estimating the envelope, is proposed. A formal listening test concludes that the artificially extended wideband speech produced by the proposed method is preferred over band-limited narrowband speech with a level of significance of more than 99
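As a rough illustration of what spectral replication of the excitation can look like, the sketch below uses spectral folding: the narrowband signal is upsampled by a factor of two through zero insertion, which mirrors the 0-4000 Hz band into 4000-8000 Hz. This is an assumption made for illustration only; the abstract does not say which replication variant the thesis uses, and the function name `spectral_fold`, the 8 kHz input rate, and the 1 kHz test tone are all hypothetical.

```python
import numpy as np


def spectral_fold(narrowband: np.ndarray) -> np.ndarray:
    """Upsample by 2 via zero insertion (no anti-imaging filter).

    Assumed illustration of spectral replication, not the thesis's method:
    the spectrum of the input is mirrored around the original Nyquist
    frequency, so energy in 0-4 kHz also appears in 4-8 kHz.
    """
    wideband = np.zeros(2 * len(narrowband), dtype=float)
    wideband[::2] = narrowband  # keep original samples, zeros in between
    return wideband


# Hypothetical test signal: a 1 kHz tone sampled at 8 kHz (narrowband rate).
fs_nb = 8000
t = np.arange(800) / fs_nb
tone = np.sin(2 * np.pi * 1000 * t)

# After folding, the signal is interpreted at 16 kHz (wideband rate).
extended = spectral_fold(tone)
spectrum = np.abs(np.fft.rfft(extended))
freqs = np.fft.rfftfreq(len(extended), d=1 / (2 * fs_nb))

# Frequencies that carry significant energy after folding.
peaks = freqs[spectrum > 0.5 * spectrum.max()]
print(peaks)
```

In this sketch the tone at 1000 Hz gains a mirror image at 7000 Hz. In a complete system along the lines the abstract describes, such a replicated excitation would then be shaped by the estimated wideband envelope.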
Number of pages: 103
Publishing institution: Aalborg Universitet
ID: 9924415