A master's thesis from Aalborg University


Vector quantization (VQ)-based generative DNN models for low delay speech and audio coding: A model-based approach to encoding packets for packet-loss robustness.

Term

4th semester

Publication year

2024

Submitted on

Pages

57

Abstract

Recent developments in state-of-the-art audio compression have led to models that achieve low bit rates while still reconstructing the audio well from the compressed embeddings. These results motivate exploring model-based techniques for making audio packets robust to packet loss, which led to the development of three models with different bit rates in this project. The three models had bit rates of 768 kbps, 192 kbps and 6 kbps and were trained on the LibriTTS corpus, whose data samples have a bit rate of 384 kbps. The largest model showed the most potential under packet loss, with good reconstruction at packet loss probabilities ranging from 20% to 80%. The main limitation of the results appeared to be the underlying autoencoders, which opens the way for future work applying the same technique to improved frameworks at lower bit rates.
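The bit rates quoted in the abstract can be related to simple arithmetic. The following is a minimal sketch, not the thesis's implementation. It assumes the 384 kbps figure corresponds to 24 kHz, 16-bit mono PCM (the LibriTTS sampling rate), and it uses a purely hypothetical vector-quantizer configuration (75 frames per second, 8 codebooks of 1024 entries) to show how a 6 kbps rate could arise. The function names and the independent per-packet loss model are illustrative assumptions as well.

```python
import math
import random


def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels=1):
    """Bit rate of uncompressed PCM audio in kbps."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def vq_bitrate_kbps(frames_per_second, num_codebooks, codebook_size):
    """Bit rate of a VQ codec that sends one index per codebook per frame."""
    bits_per_frame = num_codebooks * math.log2(codebook_size)
    return frames_per_second * bits_per_frame / 1000


def simulate_packet_loss(num_packets, loss_probability, seed=0):
    """Return a list of booleans, True where a packet is received.

    Assumption: each packet is lost independently with the same
    probability (a Bernoulli model); the thesis may use another model.
    """
    rng = random.Random(seed)
    return [rng.random() >= loss_probability for _ in range(num_packets)]


# Assumed source format: 24 kHz, 16-bit, mono.
print(pcm_bitrate_kbps(24_000, 16))   # 384.0

# Hypothetical codec configuration, chosen only for illustration.
print(vq_bitrate_kbps(75, 8, 1024))   # 6.0

# Fraction of packets received at a 20% loss probability.
received = simulate_packet_loss(1000, 0.2)
print(sum(received) / len(received))
```

Under these assumptions, the 768 kbps model operates at twice the bit rate of the source audio, while the 6 kbps model is a 64-fold reduction relative to it.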