Classification of Bisyllabic Lexical Stress Patterns in Disordered Speech Using Deep Learning
© 2016 IEEE. Children with developmental speech disabilities typically require sustained practice with a speech therapist for several years, so technology-based therapy tools can be of great benefit to them. Towards this aim, over the past 4 years we have developed speech processing tools to automatically detect common errors in disordered speech. This paper presents an automated technique to identify incorrect lexical stress. Specifically, we describe a deep neural network (DNN) that classifies the four bisyllabic stress patterns: strong-weak (SW), weak-strong (WS), strong-strong (SS) and weak-weak (WW). We derive input features for the DNN from the duration, pitch, intensity and spectral energy of each of the two consecutive syllables. Using these features, we achieve 93% correct classification between SW/WS stress patterns and 88% correct classification of the four bisyllabic patterns on speech from typically developing children, and 73.4% correct classification between SW/WS in disordered speech. These figures represent a two-fold reduction in error rates compared to our prior work, which used a DNN with differential features from consecutive syllables.
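To make the classification setup described in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the per-syllable features are simply concatenated into one input vector (in contrast to the differential features of the prior work) and that the network is a small feed-forward model with a softmax over the four patterns. The function names, layer sizes, feature values, and the use of one scalar per feature are all hypothetical; the weights are random and untrained, so the predicted label carries no meaning.

```python
import math
import random

PATTERNS = ["SW", "WS", "SS", "WW"]  # the four bisyllabic stress patterns
FEATURES = ["duration", "pitch", "intensity", "spectral_energy"]


def syllable_pair_features(syl1, syl2):
    """Concatenate the features of two consecutive syllables.

    Assumption: one scalar per feature and per syllable, kept as raw values
    for each syllable rather than as differences between syllables.
    """
    return [syl1[k] for k in FEATURES] + [syl2[k] for k in FEATURES]


def dense(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Random (untrained) weights, for illustration only."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def classify(x, layers):
    """Forward pass: ReLU hidden layers, softmax output over PATTERNS."""
    hidden = x
    for weights, bias in layers[:-1]:
        hidden = relu(dense(hidden, weights, bias))
    weights, bias = layers[-1]
    probs = softmax(dense(hidden, weights, bias))
    return PATTERNS[probs.index(max(probs))], probs


# Hypothetical, already-normalized feature values for one two-syllable word.
first = {"duration": 0.8, "pitch": 0.6, "intensity": 0.7, "spectral_energy": 0.5}
second = {"duration": 0.3, "pitch": 0.2, "intensity": 0.4, "spectral_energy": 0.1}

rng = random.Random(0)
x = syllable_pair_features(first, second)
layers = [init_layer(len(x), 16, rng),
          init_layer(16, 16, rng),
          init_layer(16, len(PATTERNS), rng)]
label, probs = classify(x, layers)
print(label, [round(p, 3) for p in probs])
```

In practice the weights would be learned from labeled syllable pairs, and the SW/WS task reported above corresponds to restricting the output to two of the four classes.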
author list (cited authors)
Shahin, M., Gutierrez-Osuna, R., & Ahmed, B.