A master's thesis from Aalborg University


HydraNet: A Network For Singing Voice Separation

Author

Term

4. term

Education

Publication year

2019

Submitted on

Pages

78

Abstract

This thesis introduces HydraNet, a neural network for single-channel blind source separation: splitting different sounds, such as vocals and instruments, out of one audio recording without prior knowledge of the sources. HydraNet combines ideas from two existing approaches, Chimera and Wave-U-Net, with the goal of improving the signal-to-distortion ratio (SDR), a common measure of separation quality where higher values mean less distortion. The model was implemented in Python using PyTorch and evaluated on the DSD100 music dataset for singing voice separation. On this dataset, HydraNet achieved an SDR of 9.78 dB for instrument separation and 3.46 dB for singing voice separation. For comparison, PyTorch implementations of Chimera and Wave-U-Net were also tested on DSD100.

[This abstract was generated with the help of AI]
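
To make the SDR figures quoted in the abstract easier to interpret, the following is a minimal sketch of a simplified SDR computation, 10·log10(‖s‖² / ‖s − ŝ‖²), where s is the reference source and ŝ the estimate. This is an illustration only: the abstract does not say which SDR implementation the thesis used, and evaluations on DSD100 commonly rely on the BSS Eval definition, which decomposes the estimate more carefully than this plain energy ratio. The function name `simple_sdr_db`, the `eps` stabiliser, and the toy signals are assumptions introduced here.

```python
import math


def simple_sdr_db(reference, estimate, eps=1e-12):
    """Simplified SDR in dB: 10 * log10(||s||^2 / ||s - s_hat||^2).

    Assumption: this is a plain energy ratio, not the BSS Eval SDR.
    `eps` is a hypothetical stabiliser that avoids division by zero.
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    # Energy of the true source signal.
    signal_energy = sum(s * s for s in reference)
    # Energy of everything in the estimate that differs from the source.
    error_energy = sum((s - e) ** 2 for s, e in zip(reference, estimate))
    return 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))


# Toy example: one second of a 440 Hz tone at 8 kHz, and an estimate
# that is the same tone scaled to 90% amplitude.
reference = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(8000)]
estimate = [0.9 * s for s in reference]

sdr = simple_sdr_db(reference, estimate)
print(f"{sdr:.2f} dB")
```

In this toy case the error is 10% of the reference amplitude, so the energy ratio is 100 and the result is about 20 dB. Under this simplified measure, the 3.46 dB reported for singing voice would correspond to an error energy a little under half the source energy.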