Articulatory-based conversion of foreign accents with deep neural networks (Conference Paper)

abstract

  • Copyright 2015 ISCA. We present an articulatory-based method for real-time accent conversion using deep neural networks (DNNs). The approach consists of two steps. First, we train a DNN articulatory synthesizer for the non-native speaker that estimates acoustics from contextualized articulatory gestures. Then we drive the DNN with articulatory gestures from a reference native speaker, mapped to the non-native articulatory space via a Procrustes transform. We evaluate the accent-conversion performance of the DNN through a series of listening tests of intelligibility, voice identity, and non-native accentedness. Compared to a baseline method based on Gaussian mixture models, the DNN accent conversions were found to be 31% more intelligible and were perceived as more native-like in 68% of cases. The DNN also succeeded in preserving the voice identity of the non-native speaker.
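  • The abstract outlines a two-step pipeline: train an articulatory-to-acoustic DNN for the non-native speaker, then drive it with native-speaker gestures mapped into the non-native articulatory space via a Procrustes transform. The sketch below is a rough, self-contained illustration of that idea, not the authors' system: the synthetic data, feature dimensions, context-window width, and the use of scikit-learn's MLPRegressor as the "DNN" are all assumptions made for the example.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def fit_procrustes(src, dst):
    """Ordinary Procrustes: rotation + uniform scale + translation mapping src -> dst.
    src, dst: (N, d) time-aligned articulatory frames from the two speakers."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    S, D = src - mu_s, dst - mu_d
    U, sigma, Vt = np.linalg.svd(S.T @ D)
    R = U @ Vt                                    # optimal rotation
    scale = sigma.sum() / (S ** 2).sum()          # optimal uniform scale
    return lambda X: scale * (X - mu_s) @ R + mu_d

def add_context(frames, width=2):
    """Stack +/- `width` neighbouring frames to form contextualized articulatory input."""
    padded = np.pad(frames, ((width, width), (0, 0)), mode="edge")
    return np.hstack([padded[i:i + len(frames)] for i in range(2 * width + 1)])

# --- synthetic stand-ins for articulatory (e.g., EMA) and acoustic (e.g., MFCC) frames ---
rng = np.random.default_rng(0)
n_frames, art_dim, ac_dim = 2000, 12, 25          # illustrative dimensions (assumptions)
nonnative_art = rng.normal(size=(n_frames, art_dim))
mixer = rng.normal(size=(art_dim * 5, ac_dim))    # 5 = context frames from add_context
nonnative_ac = np.tanh(add_context(nonnative_art) @ mixer)   # fake articulatory-to-acoustic law

# Step 1: train an articulatory synthesizer for the non-native speaker
# (contextualized articulatory gestures -> acoustic features).
synth = MLPRegressor(hidden_layer_sizes=(128, 128), max_iter=200, random_state=0)
synth.fit(add_context(nonnative_art), nonnative_ac)

# Step 2: map the native speaker's gestures into the non-native articulatory space
# with a Procrustes transform fitted on paired frames, then drive the synthesizer.
Q, _ = np.linalg.qr(rng.normal(size=(art_dim, art_dim)))     # a plausible rotation between spaces
native_art = (nonnative_art - 0.3) @ Q.T / 0.9 + rng.normal(scale=0.05, size=(n_frames, art_dim))
to_nonnative = fit_procrustes(native_art, nonnative_art)
converted_ac = synth.predict(add_context(to_nonnative(native_art)))
print("predicted acoustic frames:", converted_ac.shape)   # (2000, 25)
```

    In practice the predicted acoustic features would be fed to a vocoder using the non-native speaker's excitation, which is how the method can change accent while preserving voice identity; that vocoding step is omitted here.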

name of conference

  • Interspeech 2015

published proceedings

  • 16th Annual Conference of the International Speech Communication Association (INTERSPEECH 2015), Vols. 1-5

author list (cited authors)

  • Aryal, S., & Gutierrez-Osuna, R.

citation count

  • 9

complete list of authors

  • Aryal, Sandesh; Gutierrez-Osuna, Ricardo

publication date

  • January 2015