Analysis, Design and Implementation Considerations of a Speech Coder Based on LPCNet
Student thesis: Master Thesis and HD Thesis
- Christian Bekhøi Roskær
4. term, Signal Processing and Computing, Master (Master Programme)
In this project, made in collaboration with RTX, we examine the recently proposed voice-decoding neural network LPCNet, with the intent of implementing it on an embedded device. This work includes an analysis of the many methods and theories utilized by LPCNet. An error in the source code is corrected, and new models are trained using the TIMIT data set, yielding intelligible but unnatural-sounding speech.
The part of LPCNet with the highest computational complexity, consisting mainly of two GRU layers, is selected. This sub-algorithm of LPCNet is then analysed further in order to gain insight into its inner workings. Among other results, this analysis has led to the proposed Block Compressed Sparse Column (BCSC) format as a means of handling the block-sparse matrices utilized by the network.
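The abstract does not spell out the BCSC layout, so the following is only a minimal sketch of what a block-oriented compressed sparse column format could look like, not the format as defined in the thesis. It assumes the matrix is split into fixed-size blocks, that all-zero blocks are dropped, and that the remaining blocks are stored column by column together with their block-row indices. The names `to_bcsc`, `bcsc_matvec`, `col_ptr`, `row_idx` and `blocks` are hypothetical and chosen for illustration.

```python
def to_bcsc(dense, bh, bw):
    """Pack a dense matrix (list of rows) into a block-CSC-like layout.

    Assumed layout (illustrative, not taken from the thesis):
      col_ptr[c]..col_ptr[c+1] indexes the stored blocks of block column c,
      row_idx[k] is the block-row of stored block k,
      blocks[k] holds the bh*bw values of block k in row-major order.
    """
    n_rows, n_cols = len(dense), len(dense[0])
    assert n_rows % bh == 0 and n_cols % bw == 0
    col_ptr, row_idx, blocks = [0], [], []
    for bc in range(n_cols // bw):
        for br in range(n_rows // bh):
            block = [dense[br * bh + i][bc * bw + j]
                     for i in range(bh) for j in range(bw)]
            if any(v != 0 for v in block):  # keep only non-zero blocks
                row_idx.append(br)
                blocks.append(block)
        col_ptr.append(len(row_idx))
    return col_ptr, row_idx, blocks


def bcsc_matvec(col_ptr, row_idx, blocks, bh, bw, n_rows, x):
    """Compute y = A @ x, visiting only the stored blocks."""
    y = [0.0] * n_rows
    for bc in range(len(col_ptr) - 1):
        for k in range(col_ptr[bc], col_ptr[bc + 1]):
            br, block = row_idx[k], blocks[k]
            for i in range(bh):
                acc = 0.0
                for j in range(bw):
                    acc += block[i * bw + j] * x[bc * bw + j]
                y[br * bh + i] += acc
    return y


# Small 4x4 example with 2x1 blocks (block size chosen arbitrarily here).
dense = [[1, 0, 0, 2],
         [3, 0, 0, 0],
         [0, 0, 5, 0],
         [0, 0, 6, 0]]
col_ptr, row_idx, blocks = to_bcsc(dense, 2, 1)
y = bcsc_matvec(col_ptr, row_idx, blocks, 2, 1, 4, [1, 2, 3, 4])
```

Storing whole blocks contiguously means that a matrix-vector product of the kind a GRU layer performs reads each stored block in one sequential pass, with one index lookup per block rather than per element.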
In an FPGA-based approach, different data movement schemes were explored in order to increase energy efficiency while maintaining the throughput of the solution. In a CPU-based implementation, different causes of processing stalls were investigated, and the memory hierarchy was identified as the biggest cause of stalls in the sub-algorithm. Parallel processing has been emphasized in both approaches. Practical implementation results for these approaches have not been obtained within the time frame of the project.
Language | English |
---|---|
Publication date | 30 Jun 2020 |
Number of pages | 166 |
External collaborator | RTX A/S Christopher Meisner cm@rtx.dk Other |