Advancing Speech Processing with End-to-End Modeling and LLM Integration

March 7 @ 6:30 pm - 9:00 pm

Abstract
The field of speech processing is currently dominated by end-to-end (E2E) models, which use a single model to optimize directly for the final objective function rather than optimizing multiple sub-models separately. This trend is particularly notable in automatic speech recognition (ASR). In this talk, we will provide an overview of E2E ASR models and discuss recent advancements from an industry perspective. Subsequently, we will examine the trend of E2E modeling beyond ASR, with applications such as multi-speaker ASR and simultaneous speech translation, where ASR traditionally serves as only one of several components. This trend ultimately unlocks multimodal intelligence by integrating speech capabilities into large language models (LLMs). We will highlight the most recent developments in this area, which present unprecedented opportunities for the field.
Speaker(s): Jinyu Li
Agenda:
6:30 – 7:00 PM – Check-in, networking, food, and drink
7:00 – 8:00 PM – Presentation by Dr. Jinyu Li
8:00 – 8:30 PM – Q&A
Room: 1302, Bldg: Sobrato Campus for Discovery and Innovation Building, Santa Clara University, 500 El Camino Real, Santa Clara, California, United States, 95053
Virtual: https://events.vtools.ieee.org/m/467238