News

Microsoft’s VibeVoice is an open-source text-to-speech model for podcast-length, multi-speaker audio that captures the ...
Official code release of "DEEPTalk: Dynamic Emotion Embedding for Probabilistic Speech-Driven 3D Face Animation" [AAAI2025] - whwjdqls/DEEPTalk ...
Lyra shows superiority compared with leading omni-models in: Stronger performance: Achieves SOTA results across a variety of speech-centric tasks. More versatile: Supports image, video, ...
Creating voice agents just got a whole lot easier, thanks to OpenAI's latest speech-to-speech model, GPT-Realtime.
Speech-driven 3D facial animation is crucial in computer graphics and vision, with applications in VR, animation, telepresence, film, and gaming. Deep learning has advanced this field, but a lack of ...
A new brain prosthesis can read out inner thoughts in real time, helping people with ALS and brain stem stroke communicate ...
In 3D speech-driven facial animation generation, existing methods commonly employ pre-trained self-supervised audio models as encoders. However, due to the prevalence of phonetically similar syllables ...