Speaker recognition using Universal background model on YOHO database
Author
Majetniak, Alexandre
Term
10th term
Education
Publication year
2011
Submitted on
2011-05-31
Pages
54
Abstract
This thesis studies speaker recognition using the YOHO voice database. We use a Gaussian Mixture Model (GMM) together with a Universal Background Model (UBM), also called a "world" model. In plain terms, the system learns distinctive patterns in each speaker’s voice and compares them to a broad, generic model of many speakers to decide who is speaking. We outline the training and testing setup on YOHO and assess how this approach performs under those conditions, noting practical considerations and limitations.
[This abstract was generated with the help of AI]