Date of Award

Summer 2017

Document Type


Degree Name

Doctor of Philosophy (PhD)


Electrical & Computer Engineering

Committee Director

Jiang Li

Committee Member

Frederic McKenzie

Committee Member

Dean Krusienski

Committee Member

Vishnu Lakdawala


Recognition of emotional state and diagnosis of trauma related illnesses such as posttraumatic stress disorder (PTSD) using speech signals have been active research topics over the past decade. A typical emotion recognition system consists of three components: speech segmentation, feature extraction and emotion identification. Various speech features have been developed for emotional state recognition which can be divided into three categories, namely, excitation, vocal tract and prosodic. However, the capabilities of different feature categories and advanced machine learning techniques have not been fully explored for emotion recognition and PTSD diagnosis. For PTSD assessment, clinical diagnosis through structured interviews is a widely accepted means of diagnosis, but patients are often embarrassed to get diagnosed at clinics. The speech signal based system is a recently developed alternative. Unfortunately,PTSD speech corpora are limited in size which presents difficulties in training complex diagnostic models. This dissertation proposed sparse coding methods and deep belief network models for emotional state identification and PTSD diagnosis. It also includes an additional transfer learning strategy for PTSD diagnosis. Deep belief networks are complex models that cannot work with small data like the PTSD speech database. Thus, a transfer learning strategy was adopted to mitigate the small data problem. Transfer learning aims to extract knowledge from one or more source tasks and apply the knowledge to a target task with the intention of improving the learning. It has proved to be useful when the target task has limited high quality training data. We evaluated the proposed methods on the speech under simulated and actual stress database (SUSAS) for emotional state recognition and on two PTSD speech databases for PTSD diagnosis. Experimental results and statistical tests showed that the proposed models outperformed most state-of-the-art methods in the literature and are potentially efficient models for emotional state recognition and PTSD diagnosis.


In Copyright. URI: This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).