016: Introducing Pheme, the speech generation model built to scale.

Deep Learning with PolyAI

01/11/24 • 21 min

In this insightful episode, host Kylie Whitehead converses with Dr. Ivan Vulić, a Senior Scientist at PolyAI and a Principal Research Associate at the University of Cambridge. They discuss the development and advantages of 'Pheme', a new, more efficient model for voice generation developed by PolyAI. Unlike existing Text-To-Speech (TTS) models, Pheme is designed to generate more conversational, natural-sounding speech that can be tailored to the needs of different businesses and used for brand voices. They also touch on the balance between performance and quality in building conversational systems, and the ethical considerations surrounding voice synthesis.

Follow PolyAI on LinkedIn
Watch this and other episodes of the Deep Learning pod on YouTube

Transcript

Ivan 00:00:00.200

What previous TTS models basically did, they were just training, for example, on audio samples from audiobooks and acted speech, which all sounded very artificial. And then when you're building conversational assistants, we need this lifelike, natural conversational experience.

Kylie 00:00:21.378

Hello. Welcome to the Deep Learning Podcast. I'm Kylie Whitehead and I'm joined today by Dr. Ivan Vulić, a Senior Scientist at
