Image annotation with discriminative model and annotation refinement by visual similarity matching

Hu, Rong (RongRong)

dc.contributor.advisor	Edward Chang and Berthold Horn.	en_US
dc.contributor.author	Hu, Rong (RongRong)	en_US
dc.contributor.other	Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.	en_US
dc.date.accessioned	2011-02-23T15:02:59Z
dc.date.available	2011-02-23T15:02:59Z
dc.date.copyright	2009	en_US
dc.date.issued	2009	en_US
dc.identifier.uri	http://hdl.handle.net/1721.1/61311
dc.description	Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009.	en_US
dc.description	Cataloged from PDF version of thesis.	en_US
dc.description	Includes bibliographical references (p. 65-67).	en_US
dc.description.abstract	A large percentage of photos on the Internet cannot be reached by search engines because of the absence of textual metadata. Such metadata come from description and tags of the photos by their uploaders. Despite of decades of research, neither model based and model-free approaches can provide quality annotation to images. In this thesis, I present a hybrid annotation pipeline that combines both approaches in hopes of increasing the accuracy of the resulting annotations. Given an unlabeled image, the first step is to suggest some words via a trained model optimized for retrieval of images from text. Though the trained model cannot always provide highly relevant words, they can be used as initial keywords to query a large web image repository and obtain text associated with retrieved images. We then use perceptual features (e.g., color, texture, shape, and local characteristics) to match the retrieved images with the query photo and use visual similarity to rank the relevance of suggested annotations for the query photo.	en_US
dc.description.statementofresponsibility	by Rong Hu.	en_US
dc.format.extent	67 p.	en_US
dc.language.iso	eng	en_US
dc.publisher	Massachusetts Institute of Technology	en_US
dc.rights	M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission.	en_US
dc.rights.uri	http://dspace.mit.edu/handle/1721.1/7582	en_US
dc.subject	Electrical Engineering and Computer Science.	en_US
dc.title	Image annotation with discriminative model and annotation refinement by visual similarity matching	en_US
dc.type	Thesis	en_US
dc.description.degree	M.Eng.	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.oclc	702675924	en_US

Files in this item

Name:: 702675924-MIT.pdf
Size:: 5.856Mb
Format:: PDF
Description:: Full printable version

View/Open

This item appears in the following Collection(s)

Graduate Theses

Show simple item record