Speech Sound Animation

News

Microsoft Research Unveils VibeVoice for Long-Form Speech Synthesis

Microsoft’s VibeVoice is an open-source text-to-speech model for podcast-length, multi-speaker audio that captures the ...

GitHub · 5d

DEEPTalk: Dynamic Emotion Embedding for Probabilistic Speech ... - GitHub

Official code release of "DEEPTalk: Dynamic Emotion Embedding for Probabilistic Speech-Driven 3D Face Animation" [AAAI2025] - whwjdqls/DEEPTalk ...

GitHub · 8d

Lyra: An Efficient and Speech-Centric Framework - GitHub

Lyra shows superiority compared with leading omni-models in: Stronger performance: Achieve SOTA results across a variety of speech-centric tasks. More versatile: Support image, video, ...

OpenAI Just Announced GPT-Realtime, Its Most Advanced Voice AI Model Yet

Creating voice agents just got a whole lot easier, thanks to OpenAI's latest speech-to-speech model, GPT-Realtime.

IEEE · 14d

Speech-Driven 3D Facial Animation Based on Diffusion Model

Speech-driven 3D facial animation is crucial in computer graphics and vision, with applications in VR, animation, telepresence, film, and gaming. Deep learning has advanced this field, but a lack of ...

Scientific American · 23d

New Brain Device Is First to Read Out Inner Speech

A new brain prosthesis can read out inner thoughts in real time, helping people with ALS and brain stem stroke communicate ...

IEEE · 23d

Wav2Sem: Plug-and-Play Audio Semantic Decoupling for 3D Speech-Driven ...

In 3D speech-driven facial animation generation, existing methods commonly employ pre-trained self-supervised audio models as encoders. However, due to the prevalence of phonetically similar syllables ...
