Reduction of non-native accents through statistical parametric articulatory synthesis.
Academic Article
Overview
Research
Identity
Additional Document Info
Other
View All
Overview
abstract
This paper presents an articulatory synthesis method to transform utterances from a second language (L2) learner to appear as if they had been produced by the same speaker but with a native (L1) accent. The approach consists of building a probabilistic articulatory synthesizer (a mapping from articulators to acoustics) for the L2 speaker, then driving the model with articulatory gestures from a reference L1 speaker. To account for differences in the vocal tract of the two speakers, a Procrustes transform is used to bring their articulatory spaces into registration. In a series of listening tests, accent conversions were rated as being more intelligible and less accented than L2 utterances while preserving the voice identity of the L2 speaker. No significant effect was found between the intelligibility of accent-converted utterances and the proportion of phones outside the L2 inventory. Because the latter is a strong predictor of pronunciation variability in L2 speech, these results suggest that articulatory resynthesis can decouple those aspects of an utterance that are due to the speaker's physiology from those that are due to their linguistic gestures.