Date of Award

Fall 1993

Document Type

Thesis

Degree Name

Master of Science (MS)

Department

Electrical & Computer Engineering

Program/Concentration

Electrical Engineering

Committee Director

Stephen A. Zahorian

Committee Member

Peter L. Silsbee

Committee Member

S. Nandkumar

Committee Member

Zaki B. Nossair

Call Number for Print

Special Collections LD4331.E55Z53

Abstract

An essential requirement of speech signal processing is to extract features (or parameters) from the speech signal that encode the information it carries. The objective of this thesis work was to examine and evaluate two feature sets, formants and DCTCs, as acoustic correlates for vowel perception. Formants are the frequencies of spectral peaks of the speech signal. DCTCs are the Discrete Cosine Transform Coefficients of the magnitude spectrum and are thus features that encode the global spectral shape of the speech signal.
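
The contrast between the two feature types can be illustrated with a minimal sketch. It is not the thesis's implementation: the Hamming window, the frame length, the number of coefficients, the 1/N scaling, and the function names `dctc` and `strongest_peak_hz` are all assumptions made for illustration, and no frequency warping or amplitude scaling is applied. The peak picker returns only the single largest spectral peak, which is a crude stand-in for a real formant tracker.

```python
import numpy as np

def magnitude_spectrum(frame):
    """Magnitude of the one-sided FFT of a Hamming-windowed frame."""
    windowed = frame * np.hamming(len(frame))
    return np.abs(np.fft.rfft(windowed))

def dctc(frame, n_coeffs=10):
    """DCT-II coefficients of the magnitude spectrum (illustrative only).

    Each coefficient weights the whole spectrum with one cosine basis
    function, so the set describes global spectral shape.
    """
    magnitude = magnitude_spectrum(frame)
    n_bins = len(magnitude)
    k = np.arange(n_coeffs)[:, None]   # coefficient index
    n = np.arange(n_bins)[None, :]     # frequency-bin index
    basis = np.cos(np.pi * k * (n + 0.5) / n_bins)
    return basis @ magnitude / n_bins  # assumed 1/N scaling

def strongest_peak_hz(frame, sample_rate):
    """Frequency of the largest spectral peak (not a formant tracker)."""
    magnitude = magnitude_spectrum(frame)
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    return freqs[np.argmax(magnitude)]

# Hypothetical test signal: a 500 Hz sinusoid sampled at 8 kHz.
sample_rate = 8000
t = np.arange(256) / sample_rate
frame = np.sin(2 * np.pi * 500 * t)

coeffs = dctc(frame)
peak = strongest_peak_hz(frame, sample_rate)
```

A peak-based feature reduces the spectrum to a few frequency values, whereas each DCTC depends on every magnitude component, which is the sense in which DCTCs encode global spectral shape.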

There are different opinions regarding which feature set is a more accurate representation of vowels. In fact, the parameters most useful for automatic speech classification may not be good acoustic correlates for the perception of speech. Based on the results of Zahorian and Jagharghi (1990, 1993), we initially hypothesized that global spectral shape cues are more important to phonological perception of vowels than are formant frequency cues.

The higher-level objective of the study was to determine a feature set, based on certain aspects of both formant and global spectral shape theory, that would be a good acoustic correlate of vowel perception. We developed and investigated a new algorithm to compute DCTCs that represent the shape of the envelope of the speech spectrum. The algorithm requires only about 10 percent of the Fourier transform magnitude components, as compared to the DCTCs computed by Zahorian and Jagharghi.

Experiments conducted in this thesis work support the hypothesis that formants are insufficient acoustic correlates for vowel perception and that some type of global spectral feature is required. The original DCTC features were also found to be lacking as acoustic correlates of perception. However, a modified DCTC computation was formulated that results in more perceptually significant features. These new features also improve automatic vowel classification of noisy speech. Topics for further study are suggested.

Rights

In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/ This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).

DOI

10.25777/rdh5-nz44
