A STUDY OF FEATURE SETS FOR EMOTION RECOGNITION FROM SPEECH SIGNALS

Files: Ko_umd_0117N_16820.pdf (1.26 MB)
Date: 2015
Author: Ko, Yi-Chun
Advisor: Espy-Wilson, Carol
Abstract
This thesis focuses on finding useful features for emotion recognition from speech signals. Compared to the popular openSMILE “emobase” feature set, our proposed method reduces the feature space to about 28% of its size while boosting the recognition rate by 3.3%. Now that computing is cheap and fast and large amounts of data are available, the prevailing approach to such problems is to apply sophisticated machine learning techniques that make sense of the data implicitly. In this work, by contrast, we study particular features that are believed to correlate with changes in emotion but have not been commonly selected for emotion recognition tasks. Jitter, shimmer, breathiness, and speaking rate are analyzed and found to change systematically as a function of emotion. We not only explore these additional acoustic features, which help improve classification performance, but also try to understand how the existing features contribute to accuracy. Our results show that combining our features with MFCCs and pitch-related features leads to better performance.
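The jitter and shimmer measures mentioned above can be illustrated with a minimal sketch. This is not code from the thesis: it uses the standard local (cycle-to-cycle) definitions, and assumes the glottal period and peak-amplitude sequences have already been extracted by an external pitch tracker.

```python
import numpy as np

def jitter_local(periods):
    """Local jitter: mean absolute difference between consecutive
    glottal periods, normalized by the mean period."""
    periods = np.asarray(periods, dtype=float)
    return np.mean(np.abs(np.diff(periods))) / np.mean(periods)

def shimmer_local(amplitudes):
    """Local shimmer: mean absolute difference between consecutive
    cycle peak amplitudes, normalized by the mean amplitude."""
    amplitudes = np.asarray(amplitudes, dtype=float)
    return np.mean(np.abs(np.diff(amplitudes))) / np.mean(amplitudes)

# Perfectly periodic, constant-amplitude voicing -> zero jitter/shimmer.
print(jitter_local([0.01] * 10))    # 0.0
print(shimmer_local([1.0] * 10))    # 0.0

# Slightly perturbed periods/amplitudes -> small positive values,
# the kind of cycle-to-cycle irregularity that varies with emotion.
rng = np.random.default_rng(0)
p = 0.01 + rng.normal(0.0, 1e-4, 50)
a = 1.0 + rng.normal(0.0, 0.02, 50)
print(jitter_local(p), shimmer_local(a))
```

Higher values indicate rougher, less stable phonation; emotional speech (e.g. anger or fear) typically shifts these measures relative to neutral speech, which is what makes them candidate features for classification.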