Image annotation with discriminative model and annotation refinement by visual similarity matching
Author(s)
Hu, Rong (RongRong)
Other Contributors
Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.
Advisor
Edward Chang and Berthold Horn.
Abstract
A large percentage of photos on the Internet cannot be reached by search engines because they lack textual metadata. Such metadata comes from the descriptions and tags that uploaders attach to their photos. Despite decades of research, neither model-based nor model-free approaches can provide quality annotations for images. In this thesis, I present a hybrid annotation pipeline that combines both approaches in the hope of increasing the accuracy of the resulting annotations. Given an unlabeled image, the first step is to suggest words via a trained model optimized for retrieving images from text. Though the trained model cannot always provide highly relevant words, its suggestions can serve as initial keywords for querying a large web image repository and obtaining the text associated with the retrieved images. We then use perceptual features (e.g., color, texture, shape, and local characteristics) to match the retrieved images against the query photo, and use visual similarity to rank the relevance of the suggested annotations for the query photo.
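The following is a minimal sketch of the two-stage pipeline the abstract describes, not the thesis's actual code. The names suggest_keywords, repository.search, and the candidate fields .image and .tags are hypothetical stand-ins for the trained model and web image repository, and visual similarity is illustrated with a simple color-histogram intersection, just one of the perceptual cues (color, texture, shape, local characteristics) the abstract mentions.

```python
import numpy as np

def color_histogram(image, bins=8):
    """Per-channel color histogram, L1-normalized (a basic perceptual feature).

    Assumes `image` is an HxWx3 array with values in [0, 255].
    """
    hist = np.concatenate([
        np.histogram(image[..., c], bins=bins, range=(0, 255))[0]
        for c in range(3)
    ]).astype(float)
    return hist / hist.sum()

def visual_similarity(a, b):
    """Histogram intersection: higher means more visually similar."""
    return np.minimum(color_histogram(a), color_histogram(b)).sum()

def annotate(query_image, model, repository, top_k=5):
    # Step 1: a trained model proposes initial keywords (hypothetical API).
    keywords = model.suggest_keywords(query_image)
    # Step 2: query a web image repository with those keywords to retrieve
    # images together with their associated text (hypothetical API).
    candidates = repository.search(keywords)
    # Step 3: re-rank candidate tags by visual similarity to the query photo,
    # so tags from visually closer images accumulate higher scores.
    scores = {}
    for cand in candidates:  # each candidate carries .image and .tags
        sim = visual_similarity(query_image, cand.image)
        for tag in cand.tags:
            scores[tag] = scores.get(tag, 0.0) + sim
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

Aggregating similarity scores per tag, rather than taking tags only from the single best match, makes the final ranking more robust to any one poorly matched retrieved image.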
Description
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. Cataloged from PDF version of thesis. Includes bibliographical references (p. 65-67).
Date issued
2009
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.