Date of Award

Fall 1985

Document Type

Thesis

Department

Electrical & Computer Engineering

Program/Concentration

Electrical Engineering

Committee Director

Stephen A. Zahorian

Committee Member

Jack Stoughton

Committee Member

Joseph Hilbey

Call Number for Print

Special Collections LD4331.E55E33

Abstract

The quality of synthetic speech from Linear Predictive (LP) vocoders is known to be degraded by the lack of detail in the commonly used pulse/noise excitation model. In this investigation, it was hypothesized that the degradation stems from two limitations: the lack of precise timing information in the pulses, and the constraint that each short-time segment of excitation be either an impulse train or white noise. Accordingly, more complex excitation models were implemented using precise timing from peaks in the residual and a mixture of pulses and noise. Because the LP residual is known to be the perfect excitation signal for LP vocoders, these models were based on it. The timing was determined by locating peaks in a lowpass-filtered LP residual energy waveform. To determine the approximate mixture of pulses and noise, two methods were explored for separating the periodic and non-periodic components of the residual. The first method, based on the assumption that the two components are separated in the frequency domain, employed linear filters. The second method, based on the assumption that the components are separated in the time domain, used time-domain techniques and the lowpass-filtered residual energy waveform. The time-domain approach proved to be more feasible. Frequency-domain models were developed for the periodic pulse-like component and the non-periodic noise-like component such that the spectrum of the combined components would be flat. Listening experiments indicated that precise timing of the periodic component improved the quality of the synthetic speech. Improvements in speech quality from modeling a mixture of pulses and noise in the excitation were much more difficult to obtain.
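The timing step described in the abstract (locating peaks in a lowpass-filtered LP residual energy waveform) and a time-domain split of the residual can be illustrated with a minimal NumPy sketch. This is not the thesis implementation. Every function name and parameter here is an assumption made for illustration: the LP order of 4, the 9-sample Hann smoother standing in for the lowpass filter, the relative peak threshold, and the fixed-width gate around each peak. The test signal is synthetic (an impulse train plus weak noise driving a two-pole resonator), and LP analysis is run once over the whole signal rather than frame by frame.

```python
import numpy as np

def lp_coefficients(x, order):
    """Inverse-filter coefficients [1, a1..ap]: autocorrelation method + Levinson-Durbin."""
    n = len(x)
    r = np.array([np.dot(x[:n - k], x[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        k = -(r[i] + np.dot(a[1:i], r[i - 1:0:-1])) / err  # reflection coefficient
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a

def smoothed_residual_energy(residual, win_len=9):
    """Squared residual smoothed by a symmetric Hann window (an assumed lowpass filter)."""
    w = np.hanning(win_len)
    return np.convolve(residual ** 2, w / w.sum(), mode="same")

def pick_peaks(env, rel_threshold=0.3, min_gap=20):
    """Local maxima above a relative threshold, at least min_gap samples apart."""
    floor = rel_threshold * env.max()
    is_peak = (env[1:-1] > env[:-2]) & (env[1:-1] >= env[2:]) & (env[1:-1] > floor)
    kept = []
    for idx in np.flatnonzero(is_peak) + 1:
        if not kept or idx - kept[-1] >= min_gap:
            kept.append(int(idx))
        elif env[idx] > env[kept[-1]]:
            kept[-1] = int(idx)  # keep the larger of two close peaks
    return np.array(kept, dtype=int)

def split_by_time_gate(residual, peaks, half_width=4):
    """Assumed time-domain split: samples near a peak are 'pulse', the rest 'noise'."""
    gate = np.zeros(len(residual), dtype=bool)
    for p in peaks:
        gate[max(p - half_width, 0):p + half_width + 1] = True
    pulse_part = np.where(gate, residual, 0.0)
    return pulse_part, residual - pulse_part

# Synthetic voiced-like signal: pulses every 80 samples plus weak noise,
# passed through a two-pole resonator (500 Hz at an assumed 8 kHz sampling rate).
rng = np.random.default_rng(0)
n_samples, period = 800, 80
true_pulses = np.arange(40, n_samples, period)
excitation = 0.01 * rng.standard_normal(n_samples)
excitation[true_pulses] += 1.0

r_pole, theta = 0.95, 2 * np.pi * 500 / 8000
speech = np.zeros(n_samples)
for n in range(n_samples):
    speech[n] = excitation[n]
    if n >= 1:
        speech[n] += 2 * r_pole * np.cos(theta) * speech[n - 1]
    if n >= 2:
        speech[n] -= r_pole ** 2 * speech[n - 2]

a = lp_coefficients(speech, order=4)
residual = np.convolve(speech, a)[:n_samples]   # inverse filtering
envelope = smoothed_residual_energy(residual)
peaks = pick_peaks(envelope)                    # estimated pulse times
pulse_part, noise_part = split_by_time_gate(residual, peaks)
pulse_fraction = np.sum(pulse_part ** 2) / np.sum(residual ** 2)
```

In this sketch, `peaks` plays the role of the precise pulse timing, and `pulse_fraction` is one simple way to express a pulse/noise mixture as an energy ratio. Real speech would need frame-by-frame analysis and more careful handling of unvoiced regions, where the relative threshold used here would pick spurious peaks.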

Rights

In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/ This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).

DOI

10.25777/gjb3-a764
