Speech driven facial animation (Conference Paper)

abstract

  • The results reported in this article are part of a larger project aimed at achieving perceptually realistic, speech-driven animations of three-dimensional human faces, including their individualized nuances. We describe the audiovisual system developed for learning the spatio-temporal relationship between speech acoustics and facial animation, covering video and speech processing, pattern analysis, and MPEG-4-compliant facial animation for a given speaker. In particular, we propose a perceptual transformation of the speech spectral envelope that is shown to capture the dynamics of articulatory movements. An efficient nearest-neighbor algorithm is then used to predict novel articulatory trajectories from these speech dynamics. The results are promising and suggest a new way to model the synthetic lip motion of a given speaker driven by his/her speech; they also provide clues toward more general cross-speaker realistic animation.
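
illustrative code sketch

  • The abstract outlines a two-stage mapping: a perceptual transformation of the speech spectral envelope, followed by nearest-neighbor prediction of articulatory (facial animation) trajectories. The sketch below is a minimal illustration of that idea, not the authors' published implementation: the log-mel envelope stands in for their perceptual transform, a brute-force averaged k-nearest-neighbor lookup stands in for their efficient predictor, and the names log_mel_envelope and NearestNeighborAnimator are hypothetical.

    import numpy as np


    def hz_to_mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)


    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


    def log_mel_envelope(frame, sr=16000, n_fft=512, n_mels=24):
        # Perceptual transform of the spectral envelope: log energy in
        # mel-spaced bands (an assumed stand-in for the paper's transform).
        power = np.abs(np.fft.rfft(frame, n_fft)) ** 2
        freqs = np.fft.rfftfreq(n_fft, 1.0 / sr)
        edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0),
                                      n_mels + 2))
        feats = np.empty(n_mels)
        for i in range(n_mels):
            band = power[(freqs >= edges[i]) & (freqs < edges[i + 2])]
            feats[i] = np.log(band.sum() + 1e-10)
        return feats


    class NearestNeighborAnimator:
        # Predicts a facial-parameter vector (e.g. MPEG-4 FAPs) for a new
        # speech frame by averaging the parameters of the k training frames
        # whose perceptual features are closest to the query.
        def __init__(self, features, params, k=3):
            self.features = np.asarray(features)  # (n_frames, n_feats)
            self.params = np.asarray(params)      # (n_frames, n_params)
            self.k = k

        def predict(self, query):
            dists = np.linalg.norm(self.features - query, axis=1)
            nearest = np.argsort(dists)[: self.k]
            return self.params[nearest].mean(axis=0)


    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        frames = rng.standard_normal((200, 400))  # fake 25 ms frames, 16 kHz
        params = rng.standard_normal((200, 6))    # fake 6-D lip parameters
        feats = np.array([log_mel_envelope(f) for f in frames])
        animator = NearestNeighborAnimator(feats, params, k=3)
        print(animator.predict(log_mel_envelope(frames[0])))

  • Note that the brute-force search above is linear in the number of training frames; the "efficient nearest-neighbor algorithm" the abstract mentions implies an indexed search (for example, a k-d tree) for corpora of realistic size.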

name of conference

  • Proceedings of the 2001 workshop on Perceptive user interfaces

published proceedings

  • Proceedings of the 2001 workshop on Perceptive user interfaces

author list (cited authors)

  • Kakumanu, P., Gutierrez-Osuna, R., Esposito, A., Bryll, R., Goshtasby, A., & Garcia, O. N.

citation count

  • 17

complete list of authors

  • Kakumanu, P.; Gutierrez-Osuna, R.; Esposito, A.; Bryll, R.; Goshtasby, A.; Garcia, O. N.

publication date

  • January 2001