Audio and acoustics

Video

Synchronized Audio-Visual Generation with a Joint Generative Diffusion Model and Contrastive Loss

November 6, 2023

The rapid development of deep learning techniques has led to significant advancements in the fields of multimedia generation and synthesis. However, generating coherent and temporally aligned audio and video content remains a challenging task due…

46:11

Video

Binaural spatial audio positioning in video calls

October 4, 2023

Spatially separating voices plays a crucial role in speech intelligibility, speaker identification and cognitive load in conversations. Voices are naturally separated in in-person conversations, but in most video conferencing software voices are mixed down to…

01:03:57

Publication

Modality-Independent Teachers Meet Weakly-Supervised Audio-Visual Event Parser

Yung-Hsuan Lai, Yen-Chun Chen, Yu-Chiang Frank Wang

NeurIPS 2023 | October 2023

Publication

Imitator: Personalized Speech-driven 3D Facial Animation

Balamurugan Thambiraja, Ikhsanul Habibie, Sadegh Aliakbarian, Darren Cosker, Christian Theobalt, Justus Thies

International Conference on Computer Vision (ICCV), 2023 | October 2023

Publication

Spatio-Temporal Windowing for Encoding Perceptually Salient Early Reflections in Parametric Spatial Audio Rendering

Tobias Jüterbock, Fabian Brinkmann, Hannes Gamper, Nikunj Raghuvanshi, Stefan Weinzierl

Journal of the Audio Engineering Society | October 2023, Vol 71(10)

Career Opportunity

Undergraduate Research Internship – Computing

Posted: September 28, 2023

Location: United States

Research Area(s): Algorithms, Artificial intelligence, Audio and Acoustics, Computer vision, Data platforms and analytics, Ecology and environment, Economics, Graphics and multimedia, Hardware and devices, Human language technologies, Human-computer interaction, Mathematics, Medical, health and genomics, Programming languages and software engineering, Search and information retrieval, Security, privacy, and cryptography, Social sciences, Systems and networking

This program is for candidates who are passionate about technology and offer diverse perspectives. We don’t just value differences; we seek them out. Interns put inquiry and theory into practice. Alongside doctoral interns, and some…

Video

Final intern talk: Improving Frechet Audio Distance for Generative Music Evaluation

September 22, 2023

As generative music models become more powerful and popular, there is a growing need for robust objective metrics of music quality that correlate with human perception. The Frechet Audio Distance (FAD) is a commonly used…

41:10

Publication

ICASSP 2023 Acoustic Echo Cancellation Challenge

Ross Cutler, Ando Saabas, Tanel Parnamaa, Marju Purin, Evgenii Indenbom, Nicolae Catalin Ristea, Jegor Guzvin, Hannes Gamper, Sebastian Braun, Robert Aichner

IEEE Open Journal of Signal Processing | September 2023, Vol 5: pp. 675-685

Publication

Large-Scale Automatic Audiobook Creation

Brendan Walsh, Mark Hamilton, Greg Newby, Xi Wang, Serena Ruan, Sheng Zhao, Lei He, Shaofei Zhang, Eric Dettinger, William T. Freeman, Markus Weimer

Interspeech Show and Tell | September 2023

Video

HyWay: Physical Walk (MSR India – TAB Feb 2023)

August 21, 2023

A key aspect of attending such an event in person is being able to experience the setting in its fullness — hearing the buzz of background conversations and seeing who is around. This can be…

0:27