dc.contributor.advisor: David Simchi-Levi. [en_US]
dc.contributor.author: Shen, Yingzhen [en_US]
dc.contributor.other: Massachusetts Institute of Technology. Department of Civil and Environmental Engineering. [en_US]
dc.date.accessioned: 2015-10-30T18:56:57Z
dc.date.available: 2015-10-30T18:56:57Z
dc.date.copyright: 2015 [en_US]
dc.date.issued: 2015 [en_US]
dc.identifier.uri: http://hdl.handle.net/1721.1/99575
dc.description: Thesis: S.M. in Transportation, Massachusetts Institute of Technology, Department of Civil and Environmental Engineering, 2015. [en_US]
dc.description: Cataloged from PDF version of thesis. [en_US]
dc.description: Includes bibliographical references (pages 91-93). [en_US]
dc.description.abstract: Today, social network websites such as Twitter are important information sources for a company's marketing, logistics, and supply chain. Sometimes a topic about a product will "explode" on a "peak day," when it is suddenly discussed by a large number of users. Predicting the diffusion process of a Twitter topic helps a company forecast demand and plan ahead to dispatch its products. In this study, we collected Twitter data on 220 topics covering a wide range of fields, and created 12 features for each topic at each time stage, e.g., the number of tweets mentioning the topic per hour, the number of followers of users who have already mentioned the topic, and the percentage of root tweets among all tweets. The task is to predict the total mention count over the whole time horizon of 180 days, as early and as accurately as possible. To complete this task, we applied two approaches: fitting the curve of topic popularity (the mention count curve) with the Bass diffusion model; and using machine learning models, including K-nearest-neighbor, linear regression, bagged tree, and ensemble, to learn topic popularity as a function of the features we created. The results reveal that the basic Bass model captures the underlying mechanism of the Twitter topic development process, so the adoption of a Twitter topic is analogous to the diffusion of a new product. Using only mention count, the Bass model has much better predictive accuracy over the whole time horizon than the machine learning models with extra features. However, even with the best model (the Bass model) and focusing on the subset of topics with better predictability, predictive accuracy is still not good enough before the "explosion day." This is because an "explosion" is usually triggered by news outside Twitter and is therefore hard to predict without information from outside Twitter. [en_US]
dc.description.statementofresponsibility: by Yingzhen Shen. [en_US]
dc.format.extent: 108 pages [en_US]
dc.language.iso: eng [en_US]
dc.publisher: Massachusetts Institute of Technology [en_US]
dc.rights: M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. [en_US]
dc.rights.uri: http://dspace.mit.edu/handle/1721.1/7582 [en_US]
dc.subject: Civil and Environmental Engineering. [en_US]
dc.title: Forecasting Twitter topic popularity using bass diffusion model and machine learning [en_US]
dc.type: Thesis [en_US]
dc.description.degree: S.M. in Transportation [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Civil and Environmental Engineering
dc.identifier.oclc: 924831552 [en_US]
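
Illustrative note: the abstract describes fitting a topic's cumulative mention-count curve with the Bass diffusion model and using it to forecast the 180-day total. The sketch below shows one way such a fit could look. It is not the thesis's implementation: the symbols m (eventual total mentions), p (innovation coefficient), and q (imitation coefficient) are the standard Bass notation rather than names from this record, the grid-search fit is a stand-in chosen for simplicity, and the data are synthetic.

```python
import math


def bass_cumulative(t, m, p, q):
    """Cumulative mentions by time t under the standard Bass model."""
    e = math.exp(-(p + q) * t)
    return m * (1.0 - e) / (1.0 + (q / p) * e)


def bass_peak_time(p, q):
    """Time at which the mention rate peaks (meaningful when q > p)."""
    return math.log(q / p) / (p + q)


def fit_bass_grid(days, cumulative, p_grid, q_grid):
    """Coarse grid search over (p, q).

    For each (p, q) pair the best m has a closed form, because the
    model is linear in m: m = sum(s*y) / sum(s*s), with s the unit curve.
    """
    best = None
    for p in p_grid:
        for q in q_grid:
            shape = [bass_cumulative(t, 1.0, p, q) for t in days]
            denom = sum(s * s for s in shape)
            if denom == 0.0:
                continue
            m = sum(s * y for s, y in zip(shape, cumulative)) / denom
            sse = sum((m * s - y) ** 2 for s, y in zip(shape, cumulative))
            if best is None or sse < best[0]:
                best = (sse, m, p, q)
    _, m, p, q = best
    return m, p, q


# Synthetic example: generate 60 days of cumulative counts from known
# (assumed) parameters, then recover them and forecast the 180-day total.
true_m, true_p, true_q = 10000.0, 0.01, 0.3
days = list(range(0, 61))
observed = [bass_cumulative(t, true_m, true_p, true_q) for t in days]

p_grid = [0.005, 0.01, 0.02, 0.05]
q_grid = [0.1, 0.2, 0.3, 0.4]
m_hat, p_hat, q_hat = fit_bass_grid(days, observed, p_grid, q_grid)
forecast_180 = bass_cumulative(180, m_hat, p_hat, q_hat)
```

In this noise-free setting the true parameters lie on the grid, so the fit recovers them. With real mention counts a continuous optimizer would be a more natural choice, and, as the abstract notes, forecasts made before the "explosion day" would remain unreliable.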

