A STUDY OF FEATURE SETS FOR EMOTION RECOGNITION FROM SPEECH SIGNALS
This thesis focuses on finding useful features for emotion recognition from speech signals. Compared to the popular openSMILE “emobase” feature set, our proposed method reduces the feature space to about 28% of its original size while boosting the recognition rate by 3.3%. Now that computing is cheap and fast and large amounts of data are available, the prevailing approach to many problems is to apply sophisticated machine learning techniques that implicitly make sense of the data. In this work, by contrast, we study particular features that are believed to correlate with changes in emotion but have not commonly been selected for emotion recognition tasks. Jitter, shimmer, breathiness, and speaking rate are analyzed and found to change systematically as a function of emotion. We not only explore these additional acoustic features, which help improve classification performance, but also examine how the existing features contribute to accuracy. Our results show that combining our features with MFCCs and pitch-related features leads to better performance.
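To make the jitter and shimmer features concrete, the sketch below computes their standard "local" variants from per-cycle measurements. This is a minimal illustration only; the input values are made up, and the thesis's actual extraction pipeline (pitch-period detection, voicing decisions, etc.) is not specified here.

```python
def local_jitter(periods):
    """Mean absolute difference of consecutive pitch periods,
    divided by the mean period (the standard 'local' jitter)."""
    diffs = [abs(a - b) for a, b in zip(periods, periods[1:])]
    return (sum(diffs) / len(diffs)) / (sum(periods) / len(periods))

def local_shimmer(amplitudes):
    """Mean absolute difference of consecutive peak amplitudes,
    divided by the mean amplitude (the standard 'local' shimmer)."""
    diffs = [abs(a - b) for a, b in zip(amplitudes, amplitudes[1:])]
    return (sum(diffs) / len(diffs)) / (sum(amplitudes) / len(amplitudes))

# Hypothetical glottal-cycle measurements (seconds, linear amplitude):
periods = [0.0100, 0.0102, 0.0099, 0.0101, 0.0100]
amps = [0.80, 0.78, 0.82, 0.79, 0.81]
print(local_jitter(periods), local_shimmer(amps))
```

Both measures quantify cycle-to-cycle irregularity of the voice source, which is why they are plausible correlates of emotional arousal.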