
dc.contributor.advisor: Deb Roy. [en_US]
dc.contributor.author: McClure, David (David W.) [en_US]
dc.contributor.other: Program in Media Arts and Sciences (Massachusetts Institute of Technology) [en_US]
dc.date.accessioned: 2019-07-18T20:35:33Z
dc.date.available: 2019-07-18T20:35:33Z
dc.date.copyright: 2019 [en_US]
dc.date.issued: 2019 [en_US]
dc.identifier.uri: https://hdl.handle.net/1721.1/121838
dc.description: Thesis: S.M., Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2019 [en_US]
dc.description: Cataloged from PDF version of thesis. [en_US]
dc.description: Includes bibliographical references (pages 82-84). [en_US]
dc.description.abstract: How different, in a precise sense, is The New York Times from Fox News? Or - Fox from NPR, NPR from CNN, CNN from Breitbart? If we think of news organizations as producers of language, as "speakers" - how similar or different are the voices? This question of the distance between news sources is fundamental to concerns about fragmentation and polarization in the news ecosystem. A number of studies have measured the proximity between outlets in terms of overlap at the level of audience, and then defined content-level differences in terms of the underlying audience composition - for example, the fraction of the readership who have shared content from particular political candidates. The "content graph" of the news ecosystem - the set of similarities and differences at the level of the actual coverage - is often assumed to be tightly linked to the "audience graph"; and the two are even defined in terms of each other. [en_US]
dc.description.abstract: How exactly do these two systems interact, though? In many ways, our knowledge of the content graph is less precise than knowledge about the audience graph, which in some ways is simpler to measure. A rich line of work has studied the coverage of specific issues in news content, and recent work has started to systematically survey the content produced by a range of outlets, often by way of unsupervised approaches that characterize differences at the level of topic. Building on this, I attempt to precisely quantify the relative similarities among major media organizations from a standpoint of textual discriminability, focusing on a corpus of 1.2 million article headlines from 15 major US news outlets, extracted from an archive of 73 million links posted on Twitter over a 625-day period running from the beginning of 2017 through the summer of 2018. [en_US]
dc.description.abstract: I formulate the question as a supervised learning problem, in which classifiers are presented with a headline and trained to identify the outlet that produced it. This training objective is used to induce high-quality distributed representations of headlines, and also makes it possible to measure the degree to which different outlets produce similar and dissimilar content. I then contextualize these language-level similarities against two backdrops. First, I examine the degree to which similarities at the level of headlines correlate with similarities at the level of audiences - with specific focus on sites of misalignment, where outlets "speak" in ways that don't match the typical patterns of other outlets that share similar audiences. [en_US]
dc.description.abstract: Among the news organizations considered in this study, the Associated Press and The Hill are the two most "misaligned" outlets, and we can perhaps look to specific portions of their content as a signal for the types of topics, styles, and stances that might be effective at permeating across axes of political and cultural difference. Second - I study headlines as a historical process. How stable are the linguistic profiles of major news organizations, and to what degree have they evolved into new configurations? I find significant changes over the first 18 months of the Trump presidency, with BuzzFeed doubling down on "quiz" articles; Huffington Post moving away from lifestyle content and towards political reporting; The Daily Kos becoming less exclusively focused on politics; and Fox shifting towards a kind of "tabloid" style, with a focus on violent crime, personal misfortune, and socially-charged political issues. [en_US]
dc.description.statementofresponsibility: David McClure. [en_US]
dc.format.extent: 84 pages [en_US]
dc.language.iso: eng [en_US]
dc.publisher: Massachusetts Institute of Technology [en_US]
dc.rights: MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. [en_US]
dc.rights.uri: http://dspace.mit.edu/handle/1721.1/7582 [en_US]
dc.subject: Program in Media Arts and Sciences [en_US]
dc.title: Headlines as networked language : a study of content and audience across 73 million links on Twitter [en_US]
dc.type: Thesis [en_US]
dc.description.degree: S.M. [en_US]
dc.contributor.department: Program in Media Arts and Sciences (Massachusetts Institute of Technology) [en_US]
dc.identifier.oclc: 1108636877 [en_US]
dc.description.collection: S.M. Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences [en_US]
dspace.imported: 2019-07-18T20:35:30Z [en_US]
mit.thesis.degree: Master [en_US]
mit.thesis.department: Media [en_US]