
dc.contributor.advisor: Barry L. Vercoe.
dc.contributor.author: Kim, Youngmoo E.
dc.contributor.other: Massachusetts Institute of Technology. Dept. of Architecture. Program in Media Arts and Sciences.
dc.date.accessioned: 2011-04-04T16:15:40Z
dc.date.available: 2011-04-04T16:15:40Z
dc.date.copyright: 2003
dc.date.issued: 2003
dc.identifier.uri: http://hdl.handle.net/1721.1/62044
dc.description: Thesis (Ph. D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2003.
dc.description: Includes bibliographical references (p. 109-115).
dc.description.abstract: The singing voice is the oldest and most variable of musical instruments. By combining music, lyrics, and expression, the voice is able to affect us in ways that no other instrument can. As listeners, we are innately drawn to the sound of the human voice, and when present it is almost always the focal point of a musical piece. But the acoustic flexibility of the voice in intimating words, shaping phrases, and conveying emotion also makes it the most difficult instrument to model computationally. Moreover, while all voices are capable of producing the common sounds necessary for language understanding and communication, each voice possesses distinctive features independent of phonemes and words. These unique acoustic qualities are the result of a combination of innate physical factors and expressive characteristics of performance, reflecting an individual's vocal identity. A great deal of prior research has focused on speech recognition and speaker identification, but relatively little work has been performed specifically on singing. There are significant differences between speech and singing in terms of both production and perception. Traditional computational models of speech have focused on the intelligibility of language, often sacrificing sound quality for model simplicity. Such models, however, are detrimental to the goal of singing, which relies on acoustic authenticity for the non-linguistic communication of expression and emotion. These differences between speech and singing dictate that a different and specialized representation is needed to capture the sound quality and musicality most valued in singing. This dissertation proposes an analysis/synthesis framework specifically for the singing voice that models the time-varying physical and expressive characteristics unique to an individual voice. The system operates by jointly estimating source-filter voice model parameters, representing vocal physiology, and modeling the dynamic behavior of these features over time to represent aspects of expression. This framework is demonstrated to be useful for several applications, such as singing voice coding, automatic singer identification, and voice transformation.
dc.description.statementofresponsibility: by Youngmoo Edmund Kim.
dc.format.extent: 115 p.
dc.language.iso: eng
dc.publisher: Massachusetts Institute of Technology
dc.rights: M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission.
dc.rights.uri: http://dspace.mit.edu/handle/1721.1/7582
dc.subject: Architecture. Program in Media Arts and Sciences.
dc.title: Singing voice analysis/synthesis
dc.type: Thesis
dc.description.degree: Ph.D.
dc.contributor.department: Program in Media Arts and Sciences (Massachusetts Institute of Technology)
dc.identifier.oclc: 54910864
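
The abstract rests on the classical source-filter decomposition of the voice: an all-pole filter standing in for the vocal tract, driven by an excitation source standing in for the glottis. As a point of reference only, the Python sketch below shows a conventional frame-by-frame LPC source-filter analysis of a recording; it is not the dissertation's joint-estimation system, and the file name, sample rate, and frame settings are assumptions chosen for illustration.

# Minimal sketch of source-filter analysis via linear prediction (LPC).
# Illustrative only: input file, sample rate, and frame parameters are
# assumptions, not values taken from the dissertation.
import numpy as np
import scipy.signal
import librosa

y, sr = librosa.load("singing.wav", sr=16000)  # hypothetical input recording

frame_len = 512   # ~32 ms analysis window at 16 kHz
hop = 256
order = 12        # LPC order; sr/1000 plus a few poles is a common rule of thumb

filters = []  # per-frame vocal-tract (filter) coefficients
for start in range(0, len(y) - frame_len, hop):
    frame = y[start:start + frame_len] * np.hanning(frame_len)
    if not np.any(frame):  # skip silent frames, where LPC is ill-conditioned
        continue
    a = librosa.lpc(frame, order=order)               # all-pole vocal-tract estimate
    residual = scipy.signal.lfilter(a, [1.0], frame)  # inverse-filter: source estimate
    filters.append(a)

# `filters` traces the frame-by-frame resonance (formant) structure, while the
# residual carries pitch and voice quality; resynthesis of a frame inverts the
# filtering step: scipy.signal.lfilter([1.0], a, residual).

Tracking how the entries of `filters` evolve across frames is the simplest analogue of the dynamic feature modeling the abstract describes; the dissertation's contribution is to estimate and model these parameters jointly rather than frame by frame in isolation.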

