Show simple item record

dc.contributor.advisorDeb Roy.en_US
dc.contributor.authorVijayaraghavan, Prashanthen_US
dc.contributor.otherProgram in Media Arts and Sciences (Massachusetts Institute of Technology)en_US
dc.date.accessioned2016-12-22T16:26:42Z
dc.date.available2016-12-22T16:26:42Z
dc.date.copyright2016en_US
dc.date.issued2016en_US
dc.identifier.urihttp://hdl.handle.net/1721.1/106045
dc.descriptionThesis: S.M., Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2016.en_US
dc.descriptionCataloged from PDF version of thesis.en_US
dc.descriptionIncludes bibliographical references (pages 97-103).en_US
dc.description.abstractMicroblogging services, most notably Twitter, have become popular avenues to voice opinions and be active participants of discourse on a wide range of topics. As a consequence, Twitter has become an important part of the political battleground that journalists and political analysts can harness to analyze and understand the narratives that organically form, spread and decline among the public in a political campaign. A challenge with social media is that important discussions around certain issues can be overpowered by majoritarian or controversial topics that provoke strong reactions and attract large audiences. In this thesis we develop a method to identify the specific ideas and sentiments that represent the overall conversation surrounding a topic or event as reflected in collections of tweets. We have developed this method in the context of the 2016 US presidential elections. We present and evaluate a large scale data analytics framework, based on recent advances in deep neural networks, for identifying and analyzing election- related conversation on Twitter on a continuous, longitudinal basis in order to identify representative tweets across prominent election issues. The framework consists of two main components, (1) a dynamic topic model that identifies all tweets related to election issues using knowledge from news stories and continuous learning of Twitter's evolving vocabulary, (2) a semantic model of tweets called Tweet2vec that generates general purpose tweet embeddings used for identifying representative tweets by robust semantic clustering. The topic model performed with an average F-1 score of 0.90 across 22 different election topics on a manually annotated dataset. Tweet2Vec outperformed state-of-the- art algorithms on widely used semantic relatedness and sentiment classification evaluation tasks. To demonstrate the value of the framework, we analyzed tweets leading up to a primary debate and contrasted the automatically identified representative tweets with those that were actually used in the debate. The system was able to identify tweets that represented more semantically diverse conversations around each of the major election issues, in comparison to those that were presented during the debate. This framework may have a broad range of applications, from enabling exemplar-based methods for understanding the gist of large collections of tweets, extensible perhaps to other forms of short text documents, to providing an input for new forms of data-grounded journalism and debate.en_US
dc.description.statementofresponsibilityby Prashanth Vijayaraghavan.en_US
dc.format.extent103 pagesen_US
dc.language.isoengen_US
dc.publisherMassachusetts Institute of Technologyen_US
dc.rightsM.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission.en_US
dc.rights.urihttp://dspace.mit.edu/handle/1721.1/7582en_US
dc.subjectProgram in Media Arts and Sciences ()en_US
dc.titleAutomatic identification of representative content on Twitteren_US
dc.typeThesisen_US
dc.description.degreeS.M.en_US
dc.contributor.departmentProgram in Media Arts and Sciences (Massachusetts Institute of Technology)en_US
dc.identifier.oclc964697988en_US


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record