Conference papers

Lip Animation Synthesis: a Unified Framework for Speaking and Laughing Virtual Agent

Yu Ding 1, 2, Catherine Pelachaud 1, 2
1 MM - Multimédia
LTCI - Laboratoire Traitement et Communication de l'Information
Abstract: This paper proposes a unified statistical framework to synthesize speaking and laughing lip animations for virtual agents in real time. Our lip animation synthesis model takes as input the decomposition of a spoken text into phonemes, together with their durations, and can therefore be used with synthesized speech. In the training step, Gaussian mixture models (GMMs), called lip shape GMMs, are first used to model the relationship between phoneme duration and lip shape from human motion capture data; an interpolation function based on hidden Markov models (HMMs), called HMMs interpolation, is then learnt from the same kind of data. In the synthesis step, the lip shape GMMs are used to infer a first lip shape stream from the inputs; this stream is then smoothed by the learnt HMMs interpolation to obtain the synthesized lip animation. An objective evaluation confirms the effectiveness of the proposed framework.
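The two-stage synthesis step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no model parameters, so the per-phoneme mixtures in `LIP_SHAPE_GMMS`, the one-dimensional "lip opening" value, the function names and the frame rate are all hypothetical. The first stage takes the conditional mean of lip shape given duration under a joint Gaussian mixture (Gaussian mixture regression), which is one common way to infer from a GMM. The second stage uses a plain moving average as a stand-in for the learnt HMMs interpolation, which the abstract does not specify in enough detail to reproduce.

```python
import numpy as np

# Hypothetical joint GMMs over (duration in seconds, lip opening in arbitrary units).
# Each component is (weight, mean [duration, lip], 2x2 covariance). Values are illustrative only.
LIP_SHAPE_GMMS = {
    "a": [
        (0.6, np.array([0.10, 0.8]), np.array([[0.0004, 0.0006], [0.0006, 0.02]])),
        (0.4, np.array([0.20, 1.0]), np.array([[0.0009, 0.0003], [0.0003, 0.01]])),
    ],
    "m": [
        (1.0, np.array([0.08, 0.0]), np.array([[0.0004, 0.0], [0.0, 0.005]])),
    ],
}


def infer_lip_shape(phoneme, duration):
    """Conditional mean E[lip | duration] under the phoneme's joint GMM."""
    scores, cond_means = [], []
    for weight, mean, cov in LIP_SHAPE_GMMS[phoneme]:
        var_d = cov[0, 0]
        # Weighted marginal likelihood of the observed duration under this component.
        lik = np.exp(-0.5 * (duration - mean[0]) ** 2 / var_d) / np.sqrt(2 * np.pi * var_d)
        scores.append(weight * lik)
        # Per-component conditional mean of lip shape given duration.
        cond_means.append(mean[1] + cov[1, 0] / var_d * (duration - mean[0]))
    resp = np.array(scores) / np.sum(scores)  # component responsibilities
    return float(resp @ np.array(cond_means))


def synthesize(phonemes, durations, fps=25, window=3):
    """Build a first lip shape stream, then smooth it (window must be odd)."""
    stream = []
    for phoneme, duration in zip(phonemes, durations):
        n_frames = max(1, int(round(duration * fps)))
        # Hold the inferred lip shape constant over the phoneme's frames.
        stream.extend([infer_lip_shape(phoneme, duration)] * n_frames)
    # Stand-in smoother: moving average with edge padding, NOT the paper's HMMs interpolation.
    pad = window // 2
    padded = np.pad(np.array(stream), pad, mode="edge")
    return np.convolve(padded, np.ones(window) / window, mode="valid")


if __name__ == "__main__":
    animation = synthesize(["m", "a", "m"], [0.08, 0.12, 0.08])
    print(animation)
```

Holding each phoneme's value constant before smoothing keeps the two stages separate, mirroring the "first lip shape stream, then smoothing" order given in the abstract; a trained model would replace both the hand-set mixtures and the moving average.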
Contributor : TelecomParis HAL
Submitted on : Sunday, December 15, 2019 - 12:49:10 PM
Last modification on : Friday, June 3, 2022 - 4:24:03 PM


  • HAL Id : hal-02412183, version 1


Yu Ding, Catherine Pelachaud. Lip Animation Synthesis: a Unified Framework for Speaking and Laughing Virtual Agent. FAAVSP - The 1st Joint Conference on Facial Analysis, Animation and Auditory-Visual Speech Processing, Sep 2015, Vienna, Austria. pp.78-83. ⟨hal-02412183⟩


