Resynthesizing volumetric soundscapes : low-rank subspace methods for soundfield estimation and reconstruction

Author(s)

Russell, Spencer (Spencer Franklin)

Download: 1193026695-MIT.pdf (36.87 MB)

Other Contributors

Program in Media Arts and Sciences (Massachusetts Institute of Technology)

Advisor

Joseph A. Paradiso.

Terms of use

MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy, which is available through the URL provided. http://dspace.mit.edu/handle/1721.1/7582

Abstract

Sound and space are fundamentally intertwined, at both the physical and perceptual levels. Sound radiates from vibrating materials, filling space and creating a continuous field through which a listener moves. Despite a long history of research in spatial audio, the technology to capture these sounds in space is currently limited. Egocentric (binaural or ambisonic) recording can capture sound from all directions, but only from a limited perspective. Recording individual sources and ambiance is labor-intensive, and requires manual intervention and explicit localization. In this work I propose and implement a new approach, in which a distributed collection of microphones captures sound and space together, resynthesizing them for a (now-virtual) listener in a rich volumetric soundscape. This approach offers great flexibility to design new auditory experiences, and gives a much more semantically meaningful description of the space.

The research is situated at the Tidmarsh Wildlife Sanctuary, a 600-acre former cranberry farm that underwent the largest-ever freshwater restoration in the Northeast. It has been instrumented with a large-scale (300 m by 300 m) distributed array of 10 to 18 microphones that has been operating (almost) continuously for several years. This dissertation details methods for characterizing acoustic propagation in a challenging high-noise environment, and introduces a new method for correcting clock skew between unsynchronized transmitters and receivers. It also describes a localization method capable of locating sound-producing wildlife within the monitored area, with experiments validating the accuracy to within 5 m. The scale of the array provides an opportunity to investigate classical array processing techniques in a new context, with nonstationary signals and long interchannel delays.

We propose and validate a method for location-informed signal enhancement using a rank-1 spatial covariance matrix approximation, achieving 11 dB SDR improvements with no source signal modeling. These components are brought together in an end-to-end demonstration system that resynthesizes a virtual soundscape from multichannel signals recorded in situ, allowing users to explore the space virtually. Positive feedback is reported in a user survey.
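
For readers unfamiliar with the term, the sketch below illustrates what a rank-1 approximation of a spatial covariance matrix is in general. It is not the thesis's enhancement method, which this record does not describe in detail. It assumes NumPy is available, and the microphone count, steering vector, noise level, and function name `rank1_approximation` are all hypothetical values chosen for illustration.

```python
import numpy as np


def rank1_approximation(cov):
    """Return the best rank-1 approximation (in Frobenius norm) of a
    Hermitian positive semidefinite matrix, with its dominant eigenpair."""
    # eigh returns eigenvalues in ascending order for Hermitian input.
    eigvals, eigvecs = np.linalg.eigh(cov)
    lam = eigvals[-1]
    v = eigvecs[:, -1]
    return lam * np.outer(v, v.conj()), lam, v


# Hypothetical data: 4 microphones, one dominant source plus weak noise.
rng = np.random.default_rng(0)
n_mics, n_frames = 4, 2000

# Made-up per-microphone phase shifts standing in for propagation delays.
steering = np.exp(-2j * np.pi * rng.random(n_mics))
source = rng.standard_normal(n_frames) + 1j * rng.standard_normal(n_frames)
noise = 0.1 * (rng.standard_normal((n_mics, n_frames))
               + 1j * rng.standard_normal((n_mics, n_frames)))
x = steering[:, None] * source[None, :] + noise

# Sample spatial covariance matrix across microphones.
cov = x @ x.conj().T / n_frames
cov1, lam, v = rank1_approximation(cov)

# How well the dominant eigenvector aligns with the assumed steering vector
# (1.0 would be perfect alignment, up to a global phase).
alignment = abs(np.vdot(v, steering)) / np.linalg.norm(steering)
print(f"alignment with steering vector: {alignment:.4f}")
```

When a single source dominates, the dominant eigenvector of the covariance matrix aligns closely with that source's propagation pattern across the array, which is why a rank-1 model can be a reasonable description of one source's spatial contribution.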

Description

Thesis: Ph. D., Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, May, 2020

Cataloged from the official PDF of thesis.

Includes bibliographical references (pages 106-112).

Date issued

2020

URI

https://hdl.handle.net/1721.1/127501

Department

Program in Media Arts and Sciences (Massachusetts Institute of Technology)

Publisher

Massachusetts Institute of Technology

Keywords

Program in Media Arts and Sciences

Collections

Doctoral Theses