What we did
This is a sponsored PhD project by Darius Satongar from the University of Salford.
Loudspeaker-based spatial audio systems are often designed to create an auditory event or scene for a listener in the optimal listening position. However, in real-world domestic listening environments, listeners can be distributed across the listening area. Any translational change from the central listening position will introduce artefacts which can be challenging to evaluate perceptually.
Simulation of a loudspeaker system using non-individualised dynamic binaural synthesis is one solution to this problem. However, the validity of using such systems is not well proven. This thesis measures the limitations of using a non-individualised, dynamic binaural synthesis system to simulate the perception of loudspeaker-based panning methods across the listening area. The binaural simulation system was designed and verified in collaboration with BBC Research & Development. The equivalence of localisation errors caused by loudspeaker-based panning methods between in situ listening and binaural simulation was measured. Localisation errors were found to be equivalent within a +/-7 degree boundary in 75% of the spatial audio reproduction systems tested.
Results were then compared to a computational localisation model which was adapted to utilise head rotations. The equivalence of human acuity to sound colouration between in situ listening and non-individualised binaural simulation was measured using colouration detection thresholds from five directions. Thresholds were shown to be equivalent within a +/-4 dB equivalence boundary, supporting the use of the system for simulating sound colourations caused by loudspeaker-based panning methods.
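To illustrate what an equivalence boundary means in practice, here is a minimal Python sketch. It is not the statistical procedure used in the thesis, which is not described here. It assumes one common approach: the two conditions are treated as equivalent when a confidence interval for the mean difference lies entirely inside the boundary. The function name, the example data and the normal approximation are all assumptions made for illustration; a t-distribution would be more appropriate for small samples.

```python
from statistics import NormalDist, mean, stdev


def within_equivalence_bounds(differences, bound, confidence=0.90):
    """Return True if the confidence interval of the mean difference
    lies entirely inside (-bound, +bound).

    differences: per-listener (in situ minus simulation) values,
                 e.g. localisation error in degrees (hypothetical data).
    bound:       half-width of the equivalence boundary, e.g. 7 degrees.
    confidence:  a 90% interval corresponds to two one-sided tests at 5%.
    Assumption: normal approximation of the sampling distribution.
    """
    n = len(differences)
    centre = mean(differences)
    std_err = stdev(differences) / n ** 0.5
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    lower = centre - z * std_err
    upper = centre + z * std_err
    return -bound < lower and upper < bound


# Hypothetical differences in degrees between in situ and simulated results.
example = [1.0, -2.0, 0.5, 3.0, -1.5, 2.0, -0.5, 1.5]
print(within_equivalence_bounds(example, bound=7.0))
```

A narrower boundary makes the test harder to pass: the same hypothetical data would not be judged equivalent within +/-1 degree.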
Finally, the binaural system was applied to measure the perception of multi-loudspeaker induced colouration artefacts across the listening area. It was found that the central listening position had the lowest perceived colouration. It was also shown that the variation in perceived colouration across the listening area is larger for reverberant reproduction conditions.
Why it matters
In the context of broadcast loudspeaker-based spatial audio, a vast number of elements are combined to record, store, transmit and reproduce a sound field that is pleasing to the user, accurate or in line with artistic intent. Each element of the broadcast chain introduces a number of challenges, and in order to alleviate these challenges, they must first be understood.
One key challenge of domestic spatial audio reproduction over loudspeakers is the ability to reproduce a desired sound field over a large listening area. Ambisonics research has proposed notable solutions to this problem, including an increase in sweet-spot size with Ambisonic order and the use of ‘sweet-spot’ specific decoders. However, the technical challenges of conducting perceptual tests at a number of listening positions have meant that testing often favours optimal listening positions, or that off-centre perception is only considered objectively.
Our Goals
- Use psychoacoustic metrics and relevant modelling to compare localisation at central and non-central listening positions
- Develop a state-of-the-art Auditory Virtual Environment (AVE) capable of simulating non-central listening position artifacts
- Validate the AVE at non-central listening positions for the perception of localisation and colouration artifacts. Technical aspects will also be validated, such as matching auditory-visual modalities, total system latency and sound source plausibility
- Publish a spatially sampled binaural room impulse response dataset
- Understand the relationship between spatial audio reproduction system variables and listening area characteristics for the perception of localisation and colouration
- Determine whether 2.5-D Ambisonic reproduction can improve listening area characteristics compared to current reproduction systems
- Compare subjective results against objective measurements and perceptual modelling using state-of-the-art techniques
Outcomes
- Dr Darius Satongar's PhD thesis, titled Simulation and analysis of spatial audio reproduction and listening area effects, is now available.
- Use of psychoacoustic localisation metrics (ITD and ILD) to estimate objective quality at non-central listening positions. This was published as a BBC white paper.
- Development and validation of an optical motion tracked binaural system. This validation consisted of tracking accuracy, latency and system reliability tests alongside informal comparisons of in-situ and AVE sources.
- Assessment of the effect of headphone transparency to external sources in the context of binaural validation tests. This was published as a BBC white paper.
- Measurement of a high-resolution, spatially-sampled binaural room impulse response dataset using the listening room at the University of Salford. The SBS-BRIR dataset was published in March 2014.
- Use of the AVE and SBS-BRIR dataset to perform validation tests for the use of the AVE to simulate artifacts produced by non-central listening in domestic, loudspeaker-based spatial audio systems. This validation consisted of localisation and colouration threshold perception tests.
How it works
Psychoacoustic modelling compares interaural time differences (ITD) and interaural level differences (ILD) for simulated non-central listening positions. This provides a prediction of the performance of spatial audio reproduction systems across the listening area.
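As a rough illustration of these two cues, the Python sketch below estimates a broadband ITD and ILD from a pair of left-ear and right-ear impulse responses. It is a simplified sketch, not the model used in the project: the function name is hypothetical, the ITD is taken as the lag that maximises the cross-correlation, and the ILD as the broadband energy ratio. Perceptual models typically work in frequency bands instead.

```python
import math


def estimate_itd_ild(left, right, sample_rate):
    """Estimate broadband ITD (seconds) and ILD (dB) from two impulse responses.

    Sign convention (an assumption of this sketch): a positive ITD means
    the right-ear signal lags the left-ear signal; a positive ILD means
    the left-ear signal carries more energy.
    """
    n = len(left)
    best_lag, best_value = 0, float("-inf")
    # Brute-force cross-correlation over all possible lags.
    for lag in range(-(n - 1), len(right)):
        value = sum(
            left[i] * right[i + lag]
            for i in range(n)
            if 0 <= i + lag < len(right)
        )
        if value > best_value:
            best_value, best_lag = value, lag
    itd = best_lag / sample_rate

    energy_left = sum(x * x for x in left)
    energy_right = sum(x * x for x in right)
    ild = 10.0 * math.log10(energy_left / energy_right)
    return itd, ild


# Toy example: the right ear receives a delayed, quieter copy of the left.
left_ir = [0.0] * 32
right_ir = [0.0] * 32
left_ir[10] = 1.0
right_ir[14] = 0.5
itd, ild = estimate_itd_ild(left_ir, right_ir, sample_rate=48000)
print(f"ITD = {itd * 1e6:.1f} microseconds, ILD = {ild:.2f} dB")
```

Repeating this estimate for impulse responses at several listening positions gives a simple objective picture of how the cues shift away from the centre.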
The AVE utilises a binaural reproduction system developed in collaboration with the Binaural Broadcasting project. Binaural technologies allow us to create virtual sound environments using only headphone reproduction and a motion tracking system. This is done by simulating the filtering effect of the human head, torso and ear. The effect can be so realistic that many listeners struggle to differentiate between a real loudspeaker and a virtual loudspeaker. Low-latency motion tracking is achieved using a state-of-the-art optical motion tracking system, which also provides data for biomechanical analysis of listener characteristics.
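The core idea of dynamic binaural rendering can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the system described above: the `brirs` dictionary, its azimuth keys and all function names are hypothetical, only head rotation in the horizontal plane is considered, and the filter is simply switched to the nearest measured orientation. A practical renderer would process audio in short blocks and crossfade between filters to avoid audible switching.

```python
def nearest_orientation(measured_azimuths, head_azimuth):
    """Return the measured azimuth (degrees) closest to the tracked head azimuth."""
    def angular_distance(a, b):
        # Shortest distance around the circle, in degrees.
        return abs((a - b + 180.0) % 360.0 - 180.0)

    return min(measured_azimuths, key=lambda az: angular_distance(az, head_azimuth))


def convolve(signal, impulse_response):
    """Direct-form convolution (slow but dependency-free)."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def render(signal, brirs, head_azimuth):
    """Filter a loudspeaker signal with the BRIR pair for the current head azimuth.

    brirs: hypothetical dict mapping head azimuth in degrees to a
           (left_ear_ir, right_ear_ir) pair of impulse responses.
    """
    azimuth = nearest_orientation(brirs.keys(), head_azimuth)
    left_ir, right_ir = brirs[azimuth]
    return convolve(signal, left_ir), convolve(signal, right_ir)


# Toy data: three measured head orientations with two-tap "BRIRs".
toy_brirs = {
    0: ([1.0, 0.0], [0.5, 0.0]),
    90: ([0.0, 1.0], [0.0, 0.25]),
    270: ([0.25, 0.0], [1.0, 0.0]),
}
left_out, right_out = render([1.0, 2.0], toy_brirs, head_azimuth=350.0)
```

In this toy example a tracked head azimuth of 350 degrees selects the filters measured at 0 degrees, because that is the nearest orientation once wrap-around is taken into account.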
Because we can simulate loudspeakers within a room, we can virtually move the listener to many positions within the room to test a listener's response to a system at each position. This allows us to test many perceptual features of the spatial audio system, such as its spatial attributes or perceived colouration, and how these change across the listening area.
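To give a feel for why moving away from the centre matters, the following Python sketch computes how much later and quieter each loudspeaker arrives at a given listening position, relative to the nearest loudspeaker. It only models the direct sound and rests on simplifying assumptions that are ours, not the project's: free-field propagation, point sources with 1/r level decay, a speed of sound of 343 m/s, and a hypothetical two-loudspeaker layout.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second (assumed)


def arrival_offsets(loudspeakers, listener):
    """Return (delay_ms, level_db) per loudspeaker, relative to the nearest one.

    loudspeakers: list of (x, y) positions in metres.
    listener:     (x, y) listening position in metres.
    Assumes free-field point sources, so level falls by 6 dB per doubling
    of distance; room reflections are ignored.
    """
    distances = [math.dist(speaker, listener) for speaker in loudspeakers]
    nearest = min(distances)
    return [
        (
            (d - nearest) / SPEED_OF_SOUND * 1000.0,
            20.0 * math.log10(nearest / d),
        )
        for d in distances
    ]


# Hypothetical stereo pair at +/-30 degrees, 2 m from the central position.
stereo = [(-1.0, math.sqrt(3.0)), (1.0, math.sqrt(3.0))]

for position in [(0.0, 0.0), (0.5, 0.0)]:
    offsets = arrival_offsets(stereo, position)
    print(position, [(round(ms, 2), round(db, 2)) for ms, db in offsets])
```

At the central position both loudspeakers arrive together at the same level. Half a metre to the right, the left loudspeaker arrives roughly 1.4 ms later and about 2 dB quieter, which is the kind of change that shifts localisation and introduces colouration.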