Concatenated convolution framelets in audio compression
Authors
Hansen, Thomas Rune ; Maniragaba, Hoza Benjamin ; Pedersen, Mathias Bach
Term
4th semester
Publication year
2018
Submitted on
2018-06-07
Pages
70
Abstract
This project investigates convolution framelets, a recently introduced way to represent signals using redundant tight frames (Yin et al., 2017), for audio compression. To adapt the method to compression, we propose controlling redundancy by concatenating several less-redundant framelets. This allows the use of multiple patch sizes (window lengths) with fewer patches of each size. Our compression pipeline, inspired by Ravelli et al. (2008), has two stages: (1) represent the audio with a sparse set of frame coefficients and (2) encode these coefficients efficiently. We obtain sparsity with Orthogonal Matching Pursuit (OMP), a greedy algorithm that selects a few important components (Foucart and Rauhut, 2013). For coding, we use bitplane run-length coding with interleaving, as in Ravelli et al. We test on a set of music excerpts and estimate quality using perceptual evaluation of audio quality. We compare our results to an MP3 encoder and to the non-psychoacoustic results reported by Ravelli et al. At low bit rates, our method performs worse than both.
[This abstract was generated with the help of AI]
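The second stage of the pipeline, bitplane run-length coding, can be illustrated with a minimal sketch. It assumes the sparse coefficients have already been quantized to non-negative integer magnitudes, and it leaves out sign coding, interleaving, and the final bitstream packing, so it is not the scheme of Ravelli et al. or of this project. All function names are hypothetical. The point it shows is that sparse coefficients produce long zero runs in every bitplane, which run lengths describe compactly.

```python
def bitplanes(magnitudes, num_planes):
    """Split non-negative integers into bitplanes, most significant plane first."""
    return [[(m >> p) & 1 for m in magnitudes]
            for p in range(num_planes - 1, -1, -1)]


def run_lengths(bits):
    """Return (zero-run length before each 1, length of the trailing zero run)."""
    runs, zeros = [], 0
    for b in bits:
        if b:
            runs.append(zeros)
            zeros = 0
        else:
            zeros += 1
    return runs, zeros


def decode_runs(runs, trailing):
    """Invert run_lengths: rebuild the bit list of one plane."""
    bits = []
    for r in runs:
        bits.extend([0] * r + [1])
    bits.extend([0] * trailing)
    return bits


def encode(magnitudes, num_planes):
    """Run-length encode every bitplane (assumed: magnitudes < 2**num_planes)."""
    return [run_lengths(plane) for plane in bitplanes(magnitudes, num_planes)]


def decode(encoded):
    """Rebuild the integer magnitudes from the per-plane run lengths."""
    planes = [decode_runs(runs, trailing) for runs, trailing in encoded]
    values = [0] * (len(planes[0]) if planes else 0)
    for plane in planes:  # most significant plane first
        values = [(v << 1) | b for v, b in zip(values, plane)]
    return values


# Hypothetical quantized magnitudes of a sparse coefficient vector.
coeffs = [0, 0, 5, 0, 0, 0, 1, 0]
code = encode(coeffs, num_planes=3)
restored = decode(code)
```

Because the planes are emitted from most to least significant, truncating the encoded list after the first few planes yields a coarser approximation of the magnitudes, which is what makes bitplane schemes suitable for rate control.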
Keywords
convolution ; tight frame ; framelet ; audio ; compression ; orthogonal ; matching pursuit ; run length