Date of Award

Spring 1996

Document Type

Thesis

Degree Name

Master of Science (MS)

Department

Electrical & Computer Engineering

Program/Concentration

Electrical Engineering

Committee Director

Peter L. Silsbee

Committee Member

Stephen A. Zahorian

Committee Member

Martin D. Meyer

Call Number for Print

Special Collections LD4331.E55 S82

Abstract

An audiovisual semi-continuous hidden Markov model (HMM)-based Automatic Speech Recognition (ASR) system and an improved method of integrating audio and visual information in an audiovisual discrete HMM-based ASR system are investigated.

In the audiovisual discrete HMM, an adaptive integration formulation is employed, which incorporates the integration into the HMM at a pre-categorical stage. A visual weighting parameter is determined automatically, so that the relative contributions of audio and visual information can be adjusted adaptively. With the adaptive weight, accuracy increased by 13% compared with the same model without it.
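The role of a visual weighting parameter can be illustrated with a minimal sketch. It assumes a log-linear combination of per-stream scores, which is one common way to weight two information streams; the abstract does not state the thesis's actual formulation, and the function name `combined_log_prob` and its parameters are hypothetical. The sketch takes the weight as given and does not attempt the automatic determination described above.

```python
def combined_log_prob(log_p_audio, log_p_visual, visual_weight):
    """Weighted log-linear combination of audio and visual scores.

    ASSUMPTION: this form is illustrative only; it is not taken from
    the thesis. visual_weight = 0 uses audio alone, 1 uses video alone.
    """
    if not 0.0 <= visual_weight <= 1.0:
        raise ValueError("visual_weight must lie in [0, 1]")
    return (1.0 - visual_weight) * log_p_audio + visual_weight * log_p_visual


# Example: in clean audio the visual stream might get a small weight,
# in noisy audio a larger one (the values here are arbitrary).
clean_score = combined_log_prob(-2.0, -4.0, visual_weight=0.1)
noisy_score = combined_log_prob(-6.0, -4.0, visual_weight=0.7)
```

In a scheme of this kind, raising the weight shifts the decision toward the visual stream, which would be the desirable direction as the acoustic signal-to-noise ratio falls.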

The semi-continuous HMM is a class of models that includes both discrete and continuous mixture HMMs as special forms and unifies vector quantization (VQ), the discrete HMM, and the continuous mixture HMM. It reduces the vector quantization distortion of discrete HMMs by using continuous output probability density functions, represented by a combination of the discrete output probabilities of the model and the continuous Gaussian probability density functions (pdfs) associated with each VQ symbol. The parameters of the vector quantization codebook and the HMM can be optimized together to achieve a unified modeling approach. Experimental results show that recognition performance could be improved significantly in a relatively low signal-to-noise ratio environment by using semi-continuous HMMs, with accuracy increased by 9% on average. Modified Gaussian pdfs, which are Gaussian within two standard deviations but have "heavier" Laplacian tails, are used for the first time in the classification procedure of the semi-continuous HMM in this thesis. This is an efficient way to compensate for inadequate training of Gaussian pdfs, or for differences between testing and training environments.
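The combination described above can be sketched in a few lines. The sketch below assumes one-dimensional features and the usual semi-continuous form, in which a state's output density is the sum over codewords of the codeword's pdf times the state's discrete output probability for that codeword. The heavy-tailed variant is only one plausible reading of "Gaussian within two standard deviations with Laplacian tails": it is left unnormalized, its tail slope is chosen to match the Gaussian log-density at the knee, and all names are hypothetical rather than taken from the thesis.

```python
import math


def gaussian_pdf(x, mean, std):
    """Standard one-dimensional Gaussian density."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def heavy_tailed_pdf(x, mean, std, knee=2.0):
    """Gaussian inside `knee` standard deviations, exponential tail outside.

    ASSUMPTION: illustrative, unnormalized form. The tail is continuous
    with the Gaussian at the knee and decays like exp(-knee * |z|).
    """
    z = abs(x - mean) / std
    if z <= knee:
        return gaussian_pdf(x, mean, std)
    at_knee = gaussian_pdf(mean + knee * std, mean, std)
    return at_knee * math.exp(-knee * (z - knee))


def semi_continuous_output_prob(x, codebook, discrete_probs, pdf=gaussian_pdf):
    """Output density of one HMM state.

    codebook: list of (mean, std) pairs, one per VQ codeword.
    discrete_probs: the state's discrete output probability per codeword.
    """
    return sum(pdf(x, mean, std) * b
               for (mean, std), b in zip(codebook, discrete_probs))


# Two codewords shared by all states; this state favours the first one.
codebook = [(0.0, 1.0), (3.0, 1.0)]
state_probs = [0.7, 0.3]
density = semi_continuous_output_prob(0.5, codebook, state_probs)
robust = semi_continuous_output_prob(8.0, codebook, state_probs, heavy_tailed_pdf)
```

Because the codebook pdfs are shared across states, only the discrete weights differ from state to state. An observation far from every codeword, such as 8.0 here, keeps a larger score under the heavy-tailed pdf than under the pure Gaussian, which is the kind of effect that would help when test conditions differ from training.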

Rights

In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/ This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).

DOI

10.25777/0jjj-1y69
