Show simple item record

dc.contributor.advisorAndrew B. Lippman.en_US
dc.contributor.authorVasconcelos, Nuno Miguel Borges de Pinho Cruz deen_US
dc.contributor.otherMassachusetts Institute of Technology. Dept. of Architecture. Program In Media Arts and Sciences.en_US
dc.date.accessioned2011-05-23T17:51:03Z
dc.date.available2011-05-23T17:51:03Z
dc.date.copyright2000en_US
dc.date.issued2000en_US
dc.identifier.urihttp://hdl.handle.net/1721.1/62947
dc.descriptionThesis (Ph.D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2000.en_US
dc.descriptionIncludes bibliographical references (leaves 192-208).en_US
dc.description.abstractThis thesis presents a unified solution to visual recognition and learning in the context of visual information retrieval. Realizing that the design of an effective recognition architecture requires careful consideration of the interplay between feature selection, feature representation, and similarity function, we start by searching for a performance criteria that can simultaneously guide the design of all three components. A natural solution is to formulate visual recognition as a decision theoretical problem, where the goal is to minimize the probability of retrieval error. This leads to a Bayesian architecture that is shown to generalize a significant number of previous recognition approaches, solving some of the most challenging problems faced by these: joint modeling of color and texture, objective guidelines for controlling the trade-off between feature transformation and feature representation, and unified support for local and global queries without requiring image segmentation. The new architecture is shown to perform well on color, texture, and generic image databases, providing a good trade-off between retrieval accuracy, invariance, perceptual relevance of similarity judgments, and complexity. Because all that is needed to perform optimal Bayesian decisions is the ability to evaluate beliefs on the different hypothesis under consideration, a Bayesian architecture is not restricted to visual recognition. On the contrary, it establishes a universal recognition language (the language of probabilities) that provides a computational basis for the integration of information from multiple content sources and modalities. In result, it becomes possible to build retrieval systems that can simultaneously account for text, audio, video, or any other content modalities. Since the ability to learn follows from the ability to integrate information over time, this language is also conducive to the design of learning algorithms. We show that learning is, indeed, an important asset for visual information retrieval by designing both short and long-term learning mechanisms. Over short time scales (within a retrieval session), learning is shown to assure faster convergence to the desired target images. Over long time scales (between retrieval sessions), it allows the retrieval system to tailor itself to the preferences of particular users. In both cases, all the necessary computations are carried out through Bayesian belief propagation algorithms that, although optimal in a decision-theoretic sense, are extremely simple, intuitive, and easy to implement.en_US
dc.description.statementofresponsibilityby Nuno Miguel Borges de Pinho Cruz de Vasconcelos.en_US
dc.format.extent211 leavesen_US
dc.language.isoengen_US
dc.publisherMassachusetts Institute of Technologyen_US
dc.rightsM.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission.en_US
dc.rights.urihttp://dspace.mit.edu/handle/1721.1/7582en_US
dc.subjectArchitecture. Program In Media Arts and Sciences.en_US
dc.titleBayesian models for visual information retrievalen_US
dc.typeThesisen_US
dc.description.degreePh.D.en_US
dc.contributor.departmentProgram in Media Arts and Sciences (Massachusetts Institute of Technology)
dc.identifier.oclc47854288en_US


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record