Detecting and analyzing bursty events on Twitter

Kung, Pau Perng-Hwa

dc.contributor.advisor	Deb Roy.	en_US
dc.contributor.author	Kung, Pau Perng-Hwa	en_US
dc.contributor.other	Program in Media Arts and Sciences (Massachusetts Institute of Technology)	en_US
dc.date.accessioned	2017-03-20T19:39:58Z
dc.date.available	2017-03-20T19:39:58Z
dc.date.copyright	2016	en_US
dc.date.issued	2016	en_US
dc.identifier.uri	http://hdl.handle.net/1721.1/107558
dc.description	Thesis: S.M., Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2016.	en_US
dc.description	Cataloged from PDF version of thesis.	en_US
dc.description	Includes bibliographical references (pages 69-74).	en_US
dc.description.abstract	This thesis presents BurstMapper, a system for detecting and characterizing bursts of tweets generated by multiple sources in order to understand interactions between Twitter users and the role of exogenous events (not directly observable on Twitter) in driving tweets. The first stage of the system finds temporal clusters, or bursts, of tweets. The second stage characterizes bursts along two dimensions: semantic coherence and causal influence. Semantic coherence measures the semantic relatedness of the tweets in a burst to each other, based on an embedding of tweet contents derived from a deep neural network. Causal influence measures the potential causal interaction between Twitter users using the Hawkes process model. We introduce an annotated corpus of 7,220 tweets produced by five leading candidates in the 2016 U.S. presidential election. Evaluating the system on the annotated corpus shows that the burst detector components of our system detect tweets clearly caused by specific exogenous events (hereafter, responsive tweets) with a precision of 75%. Furthermore, experiments show that a linear combination of semantic coherence and causal influence is predictive of the presence of responsive tweets in a burst, with an F1-score of 0.76. Examining bursts along the two dimensions reveals that (i) the measures are positively correlated with each other (corr=0.33, p<0.001), (ii) the measures allow us to understand how candidates tend to respond differently to exogenous events, e.g., by attacking opponents or making plan announcements, and (iii) the measures can be used to describe the influence dynamics between candidates over time. Plotting the bursts from a corpus of 1,470 Twitter accounts (the five leading candidates and the users followed by them) shows visual evidence that some user groups (e.g., campaign staff and journalists) have higher levels of semantic coherence and causal interaction.
These experiments suggest that the bursts detected by our system provide a useful level of abstraction that summarizes tweet content, offering a way to cope with the massive amount of data on Twitter.	en_US
dc.description.statementofresponsibility	by Pau Perng-Hwa Kung.	en_US
dc.format.extent	74 pages	en_US
dc.language.iso	eng	en_US
dc.publisher	Massachusetts Institute of Technology	en_US
dc.rights	MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission.	en_US
dc.rights.uri	http://dspace.mit.edu/handle/1721.1/7582	en_US
dc.subject	Program in Media Arts and Sciences	en_US
dc.title	Detecting and analyzing bursty events on Twitter	en_US
dc.type	Thesis	en_US
dc.description.degree	S.M.	en_US
dc.contributor.department	Program in Media Arts and Sciences (Massachusetts Institute of Technology)	en_US
dc.identifier.oclc	974640408	en_US

Files in this item

Name: 974640408-MIT.pdf
Size: 7.490Mb
Format: PDF
Description: Full printable version


This item appears in the following Collection(s)

Graduate Theses
