• Jacob Møller Hjerrild Hansen
4th semester, Sound and Music (Lyd og Musik), Master's programme
In this thesis, we propose a novel source parameter estimator for stereophonic mixtures that estimates panning parameters from multi-channel audio, even when the source pitches and harmonic amplitudes are unknown. The method does not require prior knowledge of the number of sources present in the mixture. The estimator is formulated within an unsupervised learning framework based on Bayesian statistics, which allows optimal segmentation of the stereophonic signal based on maximum a posteriori modelling of the source parameters.

In the proposed method, we model the distribution of panning parameters with a Gaussian mixture model (GMM). We then estimate the model parameters by maximum a posteriori (MAP) estimation based on the expectation-maximization (EM) algorithm. To avoid one cluster being modeled by two or more Gaussians, we use a sparse Dirichlet distribution as the prior on the GMM mixture probabilities, along with a model pruning algorithm. Moreover, to obtain a better time segmentation of the stereophonic mixtures, we propose a segmentation scheme that guarantees global optimality with respect to the cost function of the maximum a posteriori model.
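As a rough illustration of the MAP-EM idea described above, the following sketch fits a one-dimensional GMM whose mixture weights carry a symmetric Dirichlet prior with concentration below one. It is not the thesis implementation: the function name `map_em_gmm`, the parameters `k_init`, `alpha`, `n_iter` and `var_floor`, the quantile initialisation, and the rule of dropping a component once its effective count falls below `1 - alpha` are all assumptions chosen for illustration.

```python
import math


def map_em_gmm(x, k_init=6, alpha=0.5, n_iter=200, var_floor=1e-4):
    """Hypothetical MAP-EM for a 1-D GMM with a Dirichlet(alpha) weight prior."""
    n = len(x)
    xs = sorted(x)
    # Assumed initialisation: means on data quantiles, shared global variance.
    means = [xs[int((j + 0.5) * n / k_init)] for j in range(k_init)]
    mean_x = sum(x) / n
    total_var = sum((v - mean_x) ** 2 for v in x) / n
    variances = [max(total_var, var_floor)] * k_init
    weights = [1.0 / k_init] * k_init

    for _ in range(n_iter):
        k = len(weights)
        counts = [0.0] * k   # effective number of points per component
        sum_x = [0.0] * k    # responsibility-weighted sum of x
        sum_xx = [0.0] * k   # responsibility-weighted sum of x^2

        # E-step: accumulate responsibilities of each component for each point.
        for v in x:
            dens = [
                weights[j]
                * math.exp(-0.5 * (v - means[j]) ** 2 / variances[j])
                / math.sqrt(2.0 * math.pi * variances[j])
                for j in range(k)
            ]
            total = sum(dens) or 1e-300  # guard against numerical underflow
            for j in range(k):
                r = dens[j] / total
                counts[j] += r
                sum_x[j] += r * v
                sum_xx[j] += r * v * v

        # M-step: the MAP weight update under the Dirichlet prior is
        # proportional to max(0, N_k + alpha - 1); components clipped to
        # zero are pruned from the model.
        raw = [max(0.0, c + alpha - 1.0) for c in counts]
        keep = [j for j in range(k) if raw[j] > 0.0]
        norm = sum(raw[j] for j in keep)
        weights = [raw[j] / norm for j in keep]
        means = [sum_x[j] / counts[j] for j in keep]
        variances = [
            max(sum_xx[j] / counts[j] - (sum_x[j] / counts[j]) ** 2, var_floor)
            for j in keep
        ]

    return weights, means, variances
```

Calling `map_em_gmm(panning_values)` on a list of per-frame panning estimates returns the surviving weights, means and variances. With `alpha` only slightly below one the pull toward sparsity is weak, so in practice a separate pruning rule, such as the one mentioned in the abstract, would still be needed to merge or remove redundant Gaussians.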
The developed estimator is evaluated through simulations on synthetic signals as well as on real audio signals. These simulations show that the estimator performs well in estimating both the source parameters and the number of sources in the stereophonic mixture.
Language: English
Publication date: 22 May 2017
Number of pages: 95
ID: 258066310