Date of Award

Spring 2012

Document Type


Degree Name

Doctor of Philosophy (PhD)


Electrical/Computer Engineering

Committee Director

Vijayan K. Asari

Committee Member

Jiang Li

Committee Member

Yuzhong Shen

Committee Member

Jim Leathrum


The study of facial movement and expression has been a prominent area of research since the early work of Charles Darwin. The Facial Action Coding System (FACS), developed by Paul Ekman, introduced the first universal method of coding and measuring facial movement. Human-Computer Interaction seeks to make human interaction with computer systems more effective, easier, safer, and more seamless. Facial expression recognition can be broken down into three distinctive subsections: Facial Feature Localization, Facial Action Recognition, and Facial Expression Classification. The first and most important stage in any facial expression analysis system is the localization of key facial features. Localization must be accurate and efficient to ensure reliable tracking and leave time for computation and comparisons to learned facial models while maintaining real-time performance. Two possible methods for localizing facial features are discussed in this dissertation.

The Active Appearance Model is a statistical model describing an object's parameters through the use of both shape and texture models, resulting in appearance. Statistical model-based training for object recognition takes multiple instances of the object class of interest, or "positive" samples, and multiple "negative" samples, i.e., images that do not contain objects of interest. Viola and Jones present a highly robust real-time face detection system, and a statistically boosted "attentional" detection cascade composed of many weak feature detectors. A basic algorithm for the elimination of unnecessary sub-frames while using Viola-Jones face detection is presented to further reduce image search time.

A real-time emotion detection system is presented which is capable of identifying seven affective states (agreeing, concentrating, disagreeing, interested, thinking, unsure, and angry) from a near-infrared video stream. The Active Appearance Model is used to place 23 landmark points around key areas of the eyes, brows, and mouth. A prioritized binary decision tree then detects, based on the actions of these key points, if one of the seven emotional states occurs as frames pass. The completed system runs accurately and achieves a real-time frame rate of approximately 36 frames per second.

A novel facial feature localization technique utilizing a nested cascade classifier tree is proposed. A coarse-to-fine search is performed in which the regions of interest are defined by the response of Haar-like features comprising the cascade classifiers. The individual responses of the Haar-like features are also used to activate finer-level searches. A specially cropped training set derived from the Cohn-Kanade AU-Coded database is also developed and tested. Extensions of this research include further testing to verify the novel facial feature localization technique presented for a full 26-point face model, and implementation of a real-time intensity sensitive automated Facial Action Coding System.