Analysis, Design and Implementation Considerations of a Speech Coder Based on LPCNet
Author
Term
4th term
Education
Publication year
2020
Submitted on
2020-06-03
Pages
166
Abstract
In this project, carried out in collaboration with RTX, we examine the recently proposed voice-decoding neural network LPCNet with the intent of implementing it on an embedded device. The work includes an analysis of the many methods and theories used by LPCNet. An error in the source code is corrected, and new models are trained on the TIMIT data set, yielding intelligible but unnatural-sounding speech. The part of LPCNet with the highest computational complexity, consisting mainly of two GRU layers, is selected. This sub-algorithm is then analysed further in order to gain insight into its inner workings. Among other things, this analysis has resulted in the proposed Block Compressed Sparse Column (BCSC) format as a means of handling the block-sparse matrices used by the network. In an FPGA-based approach, different data movement schemes were explored in order to increase energy efficiency while maintaining the throughput of the solution. In a CPU-based implementation, different causes of processing stalls were investigated, and the memory hierarchy was identified as the main cause of stalls in the sub-algorithm. Parallel processing has been emphasized in both approaches. Practical implementation results for these approaches were not obtained within the time frame of the project.
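The abstract names the BCSC format without describing its layout, so the following is only a minimal sketch of how a block-compressed, column-oriented sparse format could be used for a matrix-vector product. It assumes BCSC is analogous to the standard compressed sparse column format applied to fixed-size blocks. The function names `dense_to_bcsc` and `bcsc_matvec`, the array names, and the block sizes are hypothetical and are not taken from the report.

```python
import numpy as np


def dense_to_bcsc(dense, block_rows, block_cols):
    """Pack a dense matrix into a block compressed sparse column layout.

    Assumed layout (illustrative, not the report's definition):
      col_ptr[j]..col_ptr[j+1] index the stored blocks of block column j,
      row_idx[k] is the block-row index of stored block k,
      blocks[k] holds the values of stored block k.
    """
    n_rows, n_cols = dense.shape
    assert n_rows % block_rows == 0 and n_cols % block_cols == 0
    col_ptr, row_idx, blocks = [0], [], []
    for bj in range(n_cols // block_cols):
        for bi in range(n_rows // block_rows):
            block = dense[bi * block_rows:(bi + 1) * block_rows,
                          bj * block_cols:(bj + 1) * block_cols]
            if np.any(block != 0):  # store only blocks with a non-zero entry
                row_idx.append(bi)
                blocks.append(block.copy())
        col_ptr.append(len(blocks))
    if blocks:
        block_array = np.stack(blocks)
    else:
        block_array = np.zeros((0, block_rows, block_cols))
    return np.array(col_ptr), np.array(row_idx, dtype=int), block_array


def bcsc_matvec(col_ptr, row_idx, blocks, x, n_rows):
    """Compute y = A @ x by visiting only the stored blocks of A."""
    _, block_rows, block_cols = blocks.shape
    y = np.zeros(n_rows)
    for bj in range(len(col_ptr) - 1):
        # One segment of x is shared by every block in this block column.
        x_seg = x[bj * block_cols:(bj + 1) * block_cols]
        for k in range(col_ptr[bj], col_ptr[bj + 1]):
            bi = row_idx[k]
            y[bi * block_rows:(bi + 1) * block_rows] += blocks[k] @ x_seg
    return y
```

With this layout, the input segment for a block column is loaded once and reused for all stored blocks in that column. This is one plausible reason to prefer a column-oriented block format when data movement dominates, though whether it matches the report's motivation is an assumption.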