Date of Award

Fall 2002

Document Type

Thesis

Degree Name

Master of Science (MS)

Department

Electrical & Computer Engineering

Program/Concentration

Electrical Engineering

Committee Director

Stephen A. Zahorian

Committee Member

Vijayan Asari

Committee Member

Glenn Gerdin

Call Number for Print

Special Collections LD4331.E55 K38 2002

Abstract

This thesis presents a pitch detection algorithm that is extremely robust for both high quality and telephone speech. The kernel method for this algorithm is the Normalized Cross Correlation Function (NCCF) reported by David Talkin [16]. Major innovations include: processing of both the original acoustic signal and a nonlinearly processed version of the signal to partially restore very weak F0 components; intelligent peak picking to select multiple F0 candidates and assign merit factors; and incorporation of highly robust pitch contours obtained from smoothed versions of low frequency portions of spectrograms. Dynamic programming is used to find the "best" pitch track among all the candidates, using both local and transition costs. The algorithm has been evaluated using the Keele pitch extraction reference database as "ground truth" for both "high quality" and "telephone" speech. For both types of speech, the error rates obtained are lower than the lowest reported in the literature.
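The two generic building blocks named in the abstract, NCCF peak picking and a dynamic programming search over F0 candidates, can be illustrated with a minimal Python sketch. This is not the thesis implementation: the function names, the lag-bias merit factor, the cost definitions (local cost = 1 - merit, transition cost = absolute log F0 ratio), and all default parameter values are assumptions chosen for illustration. The nonlinear processing and spectrogram-based contours are omitted, and every frame is assumed to be voiced with at least one candidate.

```python
import numpy as np

def nccf(frame, max_lag):
    """Normalized cross-correlation of a window with lagged copies of itself.

    The window length is len(frame) - max_lag, so the frame must be longer
    than max_lag samples.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame) - max_lag
    if n <= 0:
        raise ValueError("frame too short for max_lag")
    ref = frame[:n]
    e0 = np.dot(ref, ref)
    out = np.zeros(max_lag + 1)
    for k in range(max_lag + 1):
        seg = frame[k:k + n]
        denom = np.sqrt(e0 * np.dot(seg, seg))
        out[k] = np.dot(ref, seg) / denom if denom > 0 else 0.0
    return out

def f0_candidates(frame, fs, f0_min=50.0, f0_max=500.0, n_cand=3, lag_bias=0.05):
    """Return up to n_cand (f0_hz, merit) pairs taken from NCCF peaks.

    Assumption: merit is the NCCF value scaled down slightly at long lags,
    so that pitch-period multiples rank below the true period.
    """
    min_lag = max(int(fs / f0_max), 1)
    max_lag = int(fs / f0_min)
    c = nccf(frame, max_lag)
    merit = c * (1.0 - lag_bias * np.arange(max_lag + 1) / max_lag)
    peaks = [k for k in range(min_lag, max_lag)
             if c[k] > c[k - 1] and c[k] >= c[k + 1]]
    peaks.sort(key=lambda k: merit[k], reverse=True)
    return [(fs / k, float(merit[k])) for k in peaks[:n_cand]]

def best_track(cands, trans_weight=1.0):
    """Pick one F0 per frame by dynamic programming (Viterbi-style search).

    cands is a list of per-frame lists of (f0_hz, merit) pairs.
    """
    cost = [np.array([1.0 - m for _, m in cands[0]])]
    back = []
    for t in range(1, len(cands)):
        prev_f = np.array([f for f, _ in cands[t - 1]])
        cur_f = np.array([f for f, _ in cands[t]])
        local = np.array([1.0 - m for _, m in cands[t]])
        trans = trans_weight * np.abs(np.log(cur_f[:, None] / prev_f[None, :]))
        total = cost[-1][None, :] + trans  # rows: current, columns: previous
        back.append(np.argmin(total, axis=1))
        cost.append(local + np.min(total, axis=1))
    j = int(np.argmin(cost[-1]))
    path = [j]
    for b in reversed(back):
        j = int(b[j])
        path.append(j)
    path.reverse()
    return [cands[t][j][0] for t, j in enumerate(path)]
```

In this sketch the transition cost is what suppresses isolated octave errors: a frame whose highest-merit candidate sits at half the true F0 is still assigned the true F0 when its neighbours agree, because jumping down an octave and back costs more than the small merit gain.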

Rights

In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/ This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).

DOI

10.25777/r75x-mk16
