Author(s)
Term
4. Term
Education
Publication year
2025
Submitted on
2025-05-26
Pages
9 pages
Abstract
In this work, we demonstrate the feasibility of low-latency speech enhancement using Deep Neural Networks (DNNs), aimed at integration into consumer products such as loudspeakers, soundbars, and portable speakers. Such integration often requires full-band audio processing on devices that have limited resources and are already computationally loaded. By combining state-of-the-art technologies, such as low-complexity Deep Noise Suppression (DNS) networks, an asymmetric STFT-iSTFT windowing scheme, and a dataset for Cinematic Audio Source Separation (CASS), we achieve real-time execution on various platforms and a low algorithmic latency of 11 ms. The presented models were designed through a process guided by objective evaluation, followed by a perceptual subjective evaluation to validate their performance. While promising and sufficient for the demonstrative nature of the work, the perceptual performance is not satisfactory for a customer-ready implementation. Nevertheless, the results support the potential of our approach, narrowing the gap between research and real-world application in consumer electronics.
Colophon: This page is part of the AAU Student Projects portal, which is run by Aalborg University.