
dc.contributor.advisor: Ramesh Raskar. [en_US]
dc.contributor.author: Wen, Chung-Lin, S.M. Massachusetts Institute of Technology [en_US]
dc.contributor.other: Massachusetts Institute of Technology. Department of Architecture. Program in Media Arts and Sciences. [en_US]
dc.date.accessioned: 2014-11-04T21:35:13Z
dc.date.available: 2014-11-04T21:35:13Z
dc.date.copyright: 2014 [en_US]
dc.date.issued: 2014 [en_US]
dc.identifier.uri: http://hdl.handle.net/1721.1/91417
dc.description: Thesis: S.M., Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2014. [en_US]
dc.description: 40 [en_US]
dc.description: Cataloged from PDF version of thesis. [en_US]
dc.description: Includes bibliographical references (pages 71-74). [en_US]
dc.description.abstract: We develop a novel algorithm based on spectral geometry that summarizes a photo collection into a small subset that represents the collection well. While the definition of a good summarization might not be unique, we focus on two metrics in this thesis: representativeness and diversity. By representativeness we mean that the sampled photo should be similar to other photos in the data set. The intuition behind this is that by regarding each photo as a "vote" towards the scene it depicts, we want to include the photos that have high "votes". Diversity is also desirable because repeating the same information is an inefficient use of the few spaces we have for summarization. We achieve these seemingly contradictory properties by applying diversified sampling on the denser part of the feature space. The proposed method uses diffusion distance to measure the distance between any given pair in the dataset. By emphasizing the connectivity of the local neighborhood, we achieve better accuracy compared to previous methods that used the global distance. Heat Kernel Signature (HKS) is then used to separate the denser part and the sparser part of the data. By intersecting the denser parts generated by different features, we are able to remove most of the outliers, i.e., photos that have few similar photos in the dataset. Farthest Point Sampling (FPS) is then applied to give a diversified sampling, which produces our final summarization. The method can be applied to any image collection that has a specific topic but also a fair proportion of outliers. One scenario that especially motivated us to develop this technique is the Twitter photos of a specific event. Microblogging services have become a major way for people to share new information. However, the huge amount of data, the lack of structure, and the highly noisy nature of the data prevent users from effectively mining useful information from it.
Methods based on textual data exist, but the absence of visual information makes them less valuable. To the best of our knowledge, this study is the first to address visual data in Twitter event summarization. Our method's output can produce a kind of "crowd-sourced news", useful for journalists as well as the general public. We illustrate our results by summarizing recent Twitter events and comparing them with those generated by metadata such as retweet numbers. Our results are of at least the same quality although produced by a fully automatic mechanism. In some cases, because metadata can be biased by factors such as the number of followers, our results are even better in comparison. We also note that, in our initial pilot study, the high-quality photos we found have little overlap with highly tweeted photos. That suggests the signal we found is orthogonal to the retweet signal and that the two signals can potentially be combined to achieve even better results. [en_US]
dc.description.statementofresponsibility: by Chung-Lin Wen. [en_US]
dc.format.extent: 74 pages [en_US]
dc.language.iso: eng [en_US]
dc.publisher: Massachusetts Institute of Technology [en_US]
dc.rights: M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. [en_US]
dc.rights.uri: http://dspace.mit.edu/handle/1721.1/7582 [en_US]
dc.subject: Architecture. Program in Media Arts and Sciences. [en_US]
dc.title: Event-centric Twitter photo summarization [en_US]
dc.type: Thesis [en_US]
dc.description.degree: S.M. [en_US]
dc.contributor.department: Program in Media Arts and Sciences (Massachusetts Institute of Technology)
dc.identifier.oclc: 893607735 [en_US]
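
The abstract names Farthest Point Sampling (FPS) as the step that produces the diversified final selection. The following is a minimal, generic sketch of FPS in Python and is not taken from the thesis. It assumes a precomputed symmetric distance matrix (which could hold diffusion distances, although the toy example uses plain absolute differences), and the function name `farthest_point_sampling` and the `start` parameter are hypothetical choices made for illustration.

```python
def farthest_point_sampling(dist, k, start=0):
    """Greedy FPS over a symmetric distance matrix (list of lists).

    Returns up to k indices; each new index is the unselected item whose
    distance to its nearest already-selected item is largest.
    Assumption: the seed item `start` is chosen by the caller.
    """
    n = len(dist)
    if n == 0 or k <= 0:
        return []
    selected = [start]
    chosen = {start}
    # min_dist[i] is the distance from item i to its nearest selected item.
    min_dist = list(dist[start])
    while len(selected) < min(k, n):
        candidates = [i for i in range(n) if i not in chosen]
        nxt = max(candidates, key=lambda i: min_dist[i])
        selected.append(nxt)
        chosen.add(nxt)
        # Update nearest-selected distances with the newly added item.
        min_dist = [min(a, b) for a, b in zip(min_dist, dist[nxt])]
    return selected


# Toy example: four items on a line, absolute difference as the distance.
points = [0.0, 1.0, 2.0, 10.0]
dist = [[abs(a - b) for b in points] for a in points]
print(farthest_point_sampling(dist, 3))  # [0, 3, 2]
```

In the pipeline the abstract describes, a step like this would run only on the items that remain after the sparser part of the data has been removed, so that the spread-out picks are not dominated by outliers.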

