Date of Award
Fall 1993
Document Type
Thesis
Degree Name
Master of Science (MS)
Department
Electrical & Computer Engineering
Program/Concentration
Electrical Engineering
Committee Director
Stephen A. Zahorian
Committee Member
Peter L. Silsbee
Committee Member
S. Nandkumar
Committee Member
Zaki B. Nossair
Call Number for Print
Special Collections LD4331.E55Z53
Abstract
An essential requirement of speech signal processing is to extract features (or parameters) from the speech signal that encode the information the signal carries. The objective of this thesis work was to examine and evaluate two feature sets as acoustic correlates for vowel perception: formants and DCTCs. Formants are the frequencies of spectral peaks of the speech signal. DCTCs are the Discrete Cosine Transform Coefficients of the magnitude spectrum and are thus features which encode the global spectral shape of the speech signal.
There are different opinions regarding which feature set is a more accurate representation for vowels. In fact, the parameters most useful for automatic speech classification may not be good acoustic correlates for the perception of speech. Based on the results of Zahorian and Jagharghi (1990, 1993), we initially hypothesized that global spectral shape cues are more important to the phonological perception of vowels than are formant frequency cues.
The higher-level objective of the study was to determine a feature set, based on certain aspects of both formant and global spectral shape theory, that would provide good acoustic correlates of vowel perception. We developed and investigated a new algorithm to compute DCTCs that represent the spectral shape of the envelope of the speech spectrum. It requires only about 10 percent of the Fourier Transform magnitude components used in the DCTC computation of Zahorian and Jagharghi.
Experiments conducted in this thesis work support the hypothesis that formants are insufficient acoustic correlates for vowel perception and that some type of global spectral feature is required. The original DCTC features were also found to be lacking as acoustic correlates of perception. However, a modified DCTC computation was formulated that yields more perceptually significant features. These new features also improve automatic vowel classification of noisy speech. Topics for further study are suggested.
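The abstract defines the two feature sets only in words: formants as frequencies of spectral peaks, and DCTCs as Discrete Cosine Transform Coefficients of the magnitude spectrum. The following is a minimal illustrative sketch of that contrast, not the thesis's algorithm. It assumes a plain, unnormalized DCT-II over the magnitude spectrum and uses the largest spectral peak as a crude stand-in for a formant. The abstract does not specify any frequency warping, amplitude scaling, windowing, or envelope estimation, and none is reproduced here. The function names and the two-sinusoid test frame are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Magnitude of the DFT of a real frame for bins 0..N/2 (naive O(N^2) DFT)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        mags.append(abs(acc))
    return mags


def dctc(mags, num_coeffs):
    """First num_coeffs unnormalized DCT-II coefficients of a magnitude spectrum.

    Assumption: a plain DCT-II; the thesis's actual basis and scaling
    are not given in the abstract.
    """
    m = len(mags)
    return [
        sum(mags[i] * math.cos(math.pi * k * (i + 0.5) / m) for i in range(m))
        for k in range(num_coeffs)
    ]


# Hypothetical test frame: two sinusoids standing in for a voiced segment.
N = 64
frame = [
    math.sin(2 * math.pi * 5 * t / N) + 0.5 * math.sin(2 * math.pi * 12 * t / N)
    for t in range(N)
]

mags = magnitude_spectrum(frame)

# Peak-based description: index of the largest spectral peak
# (a crude stand-in for a formant, not a real formant tracker).
peak_bin = max(range(len(mags)), key=mags.__getitem__)

# Global-shape description: a few low-order DCT coefficients of the spectrum.
coeffs = dctc(mags, 8)
```

Here `peak_bin` summarizes the spectrum by the location of one peak, while `coeffs` summarizes its overall shape; the zeroth coefficient is simply the sum of the magnitude values.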
Rights
In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/ This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).
DOI
10.25777/rdh5-nz44
Recommended Citation
Zhang, Zhongjiang. "Acoustic Correlates of Vowel Perception as Determined from Synthesis Experiments With Multi-Tone Stimuli" (1993). Master of Science (MS), Thesis, Electrical & Computer Engineering, Old Dominion University, DOI: 10.25777/rdh5-nz44
https://digitalcommons.odu.edu/ece_etds/582
Included in
Acoustics, Dynamics, and Controls Commons, Computer and Systems Architecture Commons, Signal Processing Commons