Speech LLMs: Models That Listen - Detailed Analysis & Overview

Speech LLMs: Models that listen and talk back

Try

How Large Language Models Work

Learn in-demand Machine Learning skills now → https://ibm.biz/BdK65D Learn about watsonx → https://ibm.biz/BdvxRj Large ...

Speech LLMs for Understanding - Paper Overview

This video is an overview of the paper "A Survey on

100% Local AI Speech to Speech with RAG - Low Latency | Mistral 7B, Faster Whisper ++

100% Local AI

Large Language Models explained briefly

A light intro to

StepAudio 2.5: Unified Speech & Text LLM

In this AI Research Roundup episode, Alex discusses the paper: 'StepAudio 2.5 Technical Report' StepAudio 2.5 is a single, ...

Speech LLMs Can Listen & Speak Simultaneously

Imagine having realtime conversations with

AudioPaLM: A Large Language Model That Can Speak and Listen

AudioPaLM is a multimodal language

LLaSM: Large Language and Speech Model

This paper introduces LLaSM, a large multi-modal

"Moshi: a speech-text foundation model for real-time dialogue" - Alexandre Défossez

Talk 11 of the Conversational AI Reading Group about "Moshi: a

Step-Audio 2: An Expressive Voice LLM

In this AI Research Roundup episode, Alex discusses the paper: 'Step-Audio 2 Technical Report' Step-Audio 2 is a new ...

EmoVoice: LLM-based Emotional Text-To-Speech Model (Apr 2025)

Title: EmoVoice:

Speech to Text: Fine-Tuning Generative AI for Smarter Conversational AI

Ready to become a certified watsonx AI Assistant Engineer? Register now and use code IBMTechYT20 for 20% off of your exam ...