Event Title
Does Phase Matter for Monaural Source Separation?
Location
Science Center A254
Start Date
10-27-2017 3:00 PM
End Date
10-27-2017 4:20 PM
Research Program
New Mexico Consortium Machine Learning Summer Program
Abstract
The "cocktail party" problem of fully separating multiple sources from a single-channel audio waveform remains unsolved. Current understanding of neural encoding suggests that phase information is preserved and used at every stage of the auditory pathway. However, most current computational approaches discard phase information and apply masks only to amplitude spectrograms of sound. In this paper, we ask whether preserving phase information in spectral representations of sound improves monaural separation of vocals from a musical track, using a neurally plausible sparse generative model. Our results show that preserving phase information introduces fewer artifacts in the separated tracks, as quantified by the signal-to-artifact ratio (GSAR). Furthermore, our proposed method achieves state-of-the-art performance for source separation, as quantified by a mean signal-to-interference ratio (GSIR) of 19.46.
Recommended Citation
Dubey, Mohit, "Does Phase Matter for Monaural Source Separation?" (2017). Celebration of Undergraduate Research. 1.
https://digitalcommons.oberlin.edu/cour/2017/panel_04/1
Major
Physics; Classical Guitar
Project Mentor(s)
Garrett Kenyon, Neuromorphic Computing, New Mexico Consortium
Document Type
Presentation
Notes
Session I, Panel 4 - Sound | Science
Moderator: Joseph Lubben, Associate Professor of Music Theory