Consolidating Speech Tasks with Spoken Language Models
Talk, Australian National University, ANU School of Computing, Canberra ACT, Australia
Talk, Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA, USA
Talk, Nvidia, CA, USA
Abstract: Recent Large Language Models (LLMs) show great improvements in text processing and natural language processing applications. Spoken language modeling, by comparison, is a much more recent research area. Speech, unlike text, carries many additional components: speaker characteristics, emotional cues, pausing, pitch variation, and more. Moreover, speech signals have much longer sequence lengths than text. This talk focuses on two parts. First, I will explore the utility of Spoken Language Models for speech evaluation. Second, I will discuss how to build a multi-modal voice and text language model that consolidates speech recognition and synthesis with text and speech continuation tasks.
Tutorial, 10 Bayfront Avenue, Singapore, Singapore