A master's thesis from Aalborg University


A Study on the Effects of Individual and Nonindividual Binaural Recordings on Incongruent Visual and Auditory Inputs

Author

Term

4th term

Publication year

2018

Submitted on

Pages

110

Abstract


When sight and sound do not match, people make more mistakes in judging where a sound comes from than when the cues are aligned. In binaural reproduction—recordings or synthesis that mimic how we hear with two ears and our own head and body shape—listeners usually localize better when the material matches their individual physical characteristics. This study examined how five adults (ages 23–31, mean 26) localized sounds in three conditions: Real Life (RL), Virtual Reality (VR), and No Visuals (NV). The VR and NV scenarios were designed so that the audio and visual information was incongruent. Participants first completed a localization task in a real room. Binaural recordings were then made for each participant using their own head, as well as using a Head And Torso Simulator (HATS), for playback in VR and NV. We tested two sound types—speech and noise—and compared individualized (own) versus non‑individualized (others’) recordings. Responses were analyzed statistically. Participants performed best in the real-life condition and worse when listening to binaural recordings. Within binaural playback, using one’s own recordings improved localization for speech compared with using others’ recordings, but this benefit did not appear for noise, possibly because pink noise was used. Across error types, the most difficult cases involved distinguishing between loudspeakers in the median plane (the midline in front of the listener), especially when speakers were close neighbors.
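The abstract suggests the missing own-recording benefit for noise may stem from the use of pink noise, i.e. noise whose power density falls as 1/f. As a generic illustration of that stimulus type (not the thesis's actual signal-generation code), pink noise can be synthesized by spectrally shaping white noise; the sample rate and normalization below are assumptions:

```python
import numpy as np

def pink_noise(n_samples, fs=48000, seed=0):
    """Generate pink (1/f power) noise by shaping white noise in the frequency domain.

    Illustrative sketch only; fs and peak normalization are arbitrary choices.
    """
    rng = np.random.default_rng(seed)
    white = rng.standard_normal(n_samples)
    spectrum = np.fft.rfft(white)
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    # Scale amplitude by 1/sqrt(f) so power falls off as 1/f; leave DC (f = 0) alone.
    scale = np.ones_like(freqs)
    scale[1:] = 1.0 / np.sqrt(freqs[1:])
    pink = np.fft.irfft(spectrum * scale, n=n_samples)
    # Normalize to unit peak amplitude for playback headroom.
    return pink / np.max(np.abs(pink))
```

Because pink noise is broadband and stationary, it lacks the onset transients and spectral dynamics of speech, which is one plausible reason individualized recordings helped localization for speech but not for noise.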

[This abstract was generated with the help of AI]