Date of Award
Summer 1985
Document Type
Thesis
Degree Name
Master of Science (MS)
Department
Electrical & Computer Engineering
Program/Concentration
Electrical Engineering
Committee Director
Stephen A. Zahorian
Committee Member
Jack Stoughton
Committee Member
Sharad V. Kanetkar
Call Number for Print
Special Collections LD4331.E55J33
Abstract
The objective of this research was to develop a transformation for mapping speech parameters to color parameters. This transformation is done in real time, and the resulting color parameters are continuously displayed on a color monitor. This visual speech display is to be used as a speech articulation training aid for the deaf. The conversion of acoustic speech signals into speech parameters was accomplished using special-purpose electronics. The real-time conversion of speech parameters to display parameters was controlled by an 8086/8088 microprocessor operating in an S-100 bus structure. The coefficients of the Karhunen-Loeve series expansion of speech power spectra were used to encode speech into a set of parameters called principal components. Each principal component is obtained as a linear combination of 16 spectral band energies. The focus of this research was to optimize the method for computing principal components for use with the visual speech display and to determine an optimal transformation from principal components to color parameters.
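The statement that each principal component is a linear combination of 16 spectral band energies can be illustrated with a short sketch. This is not the thesis implementation, which ran on 8086/8088 hardware. The function name, the argument layout, and the mean-subtraction step are assumptions made for illustration only.

```python
NUM_BANDS = 16  # number of spectral band energies, as stated in the abstract


def principal_components(band_energies, basis_vectors, mean_spectrum):
    """Project one spectrum onto each basis vector.

    band_energies: list of NUM_BANDS band energies for one frame.
    basis_vectors: list of weight vectors, each of length NUM_BANDS.
    mean_spectrum: average spectrum subtracted before projection
                   (an assumption; the abstract does not say whether
                   the mean is removed).
    """
    if len(band_energies) != NUM_BANDS or len(mean_spectrum) != NUM_BANDS:
        raise ValueError("expected %d band energies" % NUM_BANDS)
    centered = [e - m for e, m in zip(band_energies, mean_spectrum)]
    # Each component is a weighted sum (dot product) over the 16 bands.
    return [sum(w * c for w, c in zip(vec, centered)) for vec in basis_vectors]
```

With unit basis vectors the projection simply picks out individual bands, which makes the linear-combination structure easy to check by hand.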
A series of experiments was completed to determine the principal-components basis vectors for both non-normalized and amplitude-normalized speech spectra. These basis vectors, determined from the statistical properties of the continuous speech of both male and female speakers, were found to be relatively speaker independent. In order to restrict the scope of the research to a specific objective, the transformation of speech parameters to color parameters was optimized for vowels. Clustering experiments on vowels in principal-components spaces showed that vowels cluster more tightly when level-normalized spectral band energies are used to compute the principal-components parameters. However, implementation of a set of level-normalized spectral band energies was not feasible with the available hardware because of the requirements for real-time operation. Therefore, the transformation from vowels to colors was based on the principal-components parameters obtained from non-normalized spectral band energies, although better results are expected if level-normalized spectral band energies are used to calculate the principal components.
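A common way to obtain basis vectors of this kind is to take eigenvectors of the covariance matrix of the band energies. The sketch below estimates only the leading eigenvector, using power iteration, and is a generic illustration rather than the procedure used in the thesis. The helper `level_normalize` subtracts each frame's mean band level, which is just one possible reading of "level-normalized"; the abstract does not define the normalization. All function names are hypothetical.

```python
import math


def level_normalize(frame):
    """Remove the overall level of a frame (assumed definition)."""
    level = sum(frame) / len(frame)
    return [x - level for x in frame]


def covariance(frames):
    """Return (mean vector, sample covariance matrix) of the frames."""
    n, d = len(frames), len(frames[0])
    mean = [sum(f[j] for f in frames) / n for j in range(d)]
    cov = [[sum((f[i] - mean[i]) * (f[j] - mean[j]) for f in frames) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mean, cov


def leading_eigenvector(cov, iterations=200):
    """Approximate the dominant eigenvector of cov by power iteration."""
    d = len(cov)
    # Non-uniform start vector, to reduce the chance of starting
    # orthogonal to the dominant direction.
    v = [1.0 + 0.1 * k for k in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v
```

Further basis vectors could be found by deflating the covariance matrix and repeating the iteration, or by using a library eigen-decomposition routine.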
A linear transformation was determined such that the three widely separated vowels /a/ (as in "hod"), /i/ (as in "heed"), and /u/ (as in "who'd") result in the three widely separated colors red, green, and blue, respectively. A real-time flow-mode display of color patterns derived from speech sounds was implemented. A preliminary evaluation of the display indicates that many vowel sounds can be reliably identified by their visual display. Although separate transformations can be used for different speakers, a single fixed transformation appears adequate for males, females, and children.
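One way to construct a linear transformation with this property can be sketched as follows, assuming each vowel is summarized by a centroid of three principal components. The abstract does not state how many components were used or how the transformation was fitted, so this is an illustration only. If the three centroids form the columns of a matrix P, then M = P^-1 sends /a/ to (1, 0, 0), /i/ to (0, 1, 0), and /u/ to (0, 0, 1). All names are hypothetical.

```python
def invert_3x3(m):
    """Invert a 3x3 matrix using the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("vowel centroids are linearly dependent")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def vowel_to_rgb_matrix(centroid_a, centroid_i, centroid_u):
    """Return M such that M*centroid_a = red, M*centroid_i = green,
    and M*centroid_u = blue (RGB values on a 0..1 scale)."""
    # Columns of P are the three vowel centroids.
    p = [[centroid_a[k], centroid_i[k], centroid_u[k]] for k in range(3)]
    return invert_3x3(p)


def to_rgb(matrix, components):
    """Map three principal components to RGB, clamped to [0, 1]."""
    raw = [sum(row[k] * components[k] for k in range(3)) for row in matrix]
    return [min(1.0, max(0.0, x)) for x in raw]
```

Sounds lying between the three reference vowels map to mixtures of the primaries, and the clamp keeps out-of-range results displayable.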
Rights
In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/ This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).
DOI
10.25777/rszk-8x55
Recommended Citation
Jagharghi, Amir J.
"Color Display of Vowel Spectra as a Training Aid for the Deaf"
(1985). Master of Science (MS), Thesis, Electrical & Computer Engineering, Old Dominion University, DOI: 10.25777/rszk-8x55
https://digitalcommons.odu.edu/ece_etds/378
Included in
Graphics and Human Computer Interfaces Commons, Signal Processing Commons, Software Engineering Commons, Speech and Hearing Science Commons