
Synchronized Audio-Visual Generation with a Joint Generative Diffusion Model and Contrastive Loss
The rapid development of deep learning techniques has led to significant advancements in the fields of multimedia generation and synthesis. However, generating coherent and temporally aligned audio and video content remains a challenging task due…
Binaural spatial audio positioning in video calls
Spatially separating voices plays a crucial role in speech intelligibility, speaker identification and cognitive load in conversations. Voices are naturally separated in in-person conversations, but in most video conferencing software voices are mixed down to…
Undergraduate Research Internship – Computing
This program is for candidates who are passionate about technology and offer diverse perspectives. We don’t just value differences; we seek them out. Interns put inquiry and theory into practice. Alongside doctoral interns, and some…
Final intern talk: Improving Frechet Audio Distance for Generative Music Evaluation
As generative music models become more powerful and popular, there is a growing need for robust objective metrics of music quality that correlate with human perception. The Frechet Audio Distance (FAD) is a commonly used…
ICASSP 2023 Acoustic Echo Cancellation Challenge
Large-Scale Automatic Audiobook Creation
HyWay: Physical Walk (MSR India – TAB Feb 2023)
A key aspect of attending such an event in person is being able to experience the setting in its fullness — hearing the buzz of background conversations and seeing who is around. This can be…