
dc.contributor.advisor | Deb Roy. | en_US
dc.contributor.author | Kung, Pau Perng-Hwa | en_US
dc.contributor.other | Program in Media Arts and Sciences (Massachusetts Institute of Technology) | en_US
dc.date.accessioned | 2017-03-20T19:39:58Z |
dc.date.available | 2017-03-20T19:39:58Z |
dc.date.copyright | 2016 | en_US
dc.date.issued | 2016 | en_US
dc.identifier.uri | http://hdl.handle.net/1721.1/107558 |
dc.description | Thesis: S.M., Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2016. | en_US
dc.description | Cataloged from PDF version of thesis. | en_US
dc.description | Includes bibliographical references (pages 69-74). | en_US
dc.description.abstract | This thesis presents BurstMapper, a system for detecting and characterizing bursts of tweets generated by multiple sources in order to understand interactions between Twitter users and the role of exogenous events (not directly observable on Twitter) in driving tweets. The first stage of the system finds temporal clusters, or bursts, of tweets. The second stage characterizes bursts along two dimensions, semantic coherence and causal influence. Semantic coherence measures the semantic relatedness of the tweets in a burst to each other, based on a deep neural network derived embedding of tweet contents. Causal influence measures the potential causal interaction between Twitter users using the Hawkes process model. We introduce an annotated corpus of 7,220 tweets produced by five leading candidates in the 2016 U.S. presidential election. Evaluating the system on the annotated corpus shows that the burst detector component of our system detects tweets clearly caused by specific exogenous events (hereafter, responsive tweets) with a precision of 75%. Furthermore, experiments show that a linear combination of semantic coherence and causal influence is predictive of the presence of responsive tweets in a burst, with an F1-score of 0.76. Examining bursts along the two dimensions reveals that (i) the measures are positively correlated with each other (corr=0.33, p<0.001), (ii) the measures allow us to understand how candidates tend to respond differently to exogenous events, e.g., by attacking opponents or making plan announcements, and (iii) the measures can be used to describe the influence dynamics between candidates over time. Plotting the bursts from a corpus of 1,470 Twitter accounts (the five leading candidates and the users followed by them) shows visual evidence that some user groups (e.g., campaign staff, journalists, etc.) have higher levels of semantic coherence and causal interaction. These experiments suggest that the bursts detected by our system provide a useful level of abstraction that summarizes tweet content, providing a solution for coping with the massive amount of data on Twitter. | en_US
dc.description.statementofresponsibility | by Pau Perng-Hwa Kung. | en_US
dc.format.extent | 74 pages | en_US
dc.language.iso | eng | en_US
dc.publisher | Massachusetts Institute of Technology | en_US
dc.rights | MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. | en_US
dc.rights.uri | http://dspace.mit.edu/handle/1721.1/7582 | en_US
dc.subject | Program in Media Arts and Sciences () | en_US
dc.title | Detecting and analyzing bursty events on Twitter | en_US
dc.type | Thesis | en_US
dc.description.degree | S.M. | en_US
dc.contributor.department | Program in Media Arts and Sciences (Massachusetts Institute of Technology) | en_US
dc.identifier.oclc | 974640408 | en_US