A master's thesis from Aalborg University

Model-based Analysis and Synthesis of Aging Effects on Human Voice Production

Author

Term

4th term

Publication year

2020

Submitted on

Abstract

This thesis explores voice synthesis by combining physics-based modeling with analytical methods to support the design of an ageing voice model. A secondary contribution is an adapted, complementary analysis tool developed in parallel. The overall goal is a parametric model in which age is a controllable parameter; this document provides the theoretical foundation. Although voice synthesis is widely used, few systems let users adjust perceived age, and related work mostly transforms existing recordings to sound older or younger. Because voice ageing is complex and not yet fully understood, the thesis compiles knowledge about voice production and description, how voices change across the lifespan, the relevant physics and physiology, and the computational techniques used to model them. It then introduces the developed ageing model and a fixed-age voice model (FAM), and evaluates both for perceived credibility and sound quality.

[This abstract has been rewritten with the help of AI based on the project's original abstract]