dc.contributor.author: Wang, Tianyu Tom
dc.contributor.author: Quatieri, Thomas F.
dc.date.accessioned: 2010-04-06T21:01:56Z
dc.date.available: 2010-04-06T21:01:56Z
dc.date.issued: 2009-10
dc.identifier.issn: 1558-7916
dc.identifier.other: INSPEC Accession Number: 10940127
dc.identifier.uri: http://hdl.handle.net/1721.1/53522
dc.description.abstract: This paper considers the problem of obtaining an accurate spectral representation of speech formant structure when the voicing source exhibits a high fundamental frequency. Our work is inspired by auditory perception and physiological studies implicating the use of pitch dynamics in speech by humans. We develop and assess signal processing schemes aimed at exploiting temporal change of pitch to address the high-pitch formant frequency estimation problem. Specifically, we propose a 2-D analysis framework using 2-D transformations of the time-frequency space. In one approach, we project changing spectral harmonics over time to a 1-D function of frequency. In a second approach, we draw upon previous work of Quatieri and Ezzat, with similarities to the auditory modeling efforts of Chi, where localized 2-D Fourier transforms of the time-frequency space provide improved source-filter separation when pitch is changing. Our methods show quantitative improvements for synthesized vowels with stationary formant structure in comparison to traditional and homomorphic linear prediction. We also demonstrate the feasibility of applying our methods on stationary vowel regions of natural speech spoken by high-pitch females of the TIMIT corpus. Finally, we show improvements afforded by the proposed analysis framework in formant tracking on examples of stationary and time-varying formant structure. [en]
dc.description.sponsorship: United States. Dept. of Defense (Air Force Contract FA8721 05 C 0002) [en]
dc.language.iso: en_US
dc.publisher: Institute of Electrical and Electronics Engineers [en]
dc.relation.isversionof: http://dx.doi.org/10.1109/tasl.2009.2024732 [en]
dc.rights: Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. [en]
dc.source: IEEE [en]
dc.subject: temporal change of pitch [en]
dc.subject: spectrotemporal analysis [en]
dc.subject: linear prediction [en]
dc.subject: high-pitch effects [en]
dc.subject: formant estimation [en]
dc.title: High-Pitch Formant Estimation by Exploiting Temporal Change of Pitch [en]
dc.type: Article [en]
dc.identifier.citation: Wang, T.T., and T.F. Quatieri. “High-Pitch Formant Estimation by Exploiting Temporal Change of Pitch.” Audio, Speech, and Language Processing, IEEE Transactions on 18.1 (2010): 171-186. © 2009 Institute of Electrical and Electronics Engineers. [en]
dc.contributor.department: Harvard University--MIT Division of Health Sciences and Technology [en_US]
dc.contributor.department: Harvard University--MIT Division of Health Sciences and Technology [en_US]
dc.contributor.department: Lincoln Laboratory [en_US]
dc.contributor.approver: Quatieri, Thomas F.
dc.contributor.mitauthor: Wang, Tianyu Tom
dc.contributor.mitauthor: Quatieri, Thomas F.
dc.relation.journal: IEEE Transactions on Audio, Speech, and Language Processing [en]
dc.eprint.version: Final published version [en]
dc.type.uri: http://purl.org/eprint/type/JournalArticle [en]
eprint.status: http://purl.org/eprint/status/PeerReviewed [en]
dspace.orderedauthors: Wang, T.T.; Quatieri, T.F. [en]
mit.license: PUBLISHER_POLICY [en]
mit.metadata.status: Complete

