Show simple item record

dc.contributor.advisor: Devavrat Shah. [en_US]
dc.contributor.author: Parthasarathy, Dhruuv [en_US]
dc.contributor.other: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. [en_US]
dc.date.accessioned: 2014-11-24T18:40:28Z
dc.date.available: 2014-11-24T18:40:28Z
dc.date.copyright: 2014 [en_US]
dc.date.issued: 2014 [en_US]
dc.identifier.uri: http://hdl.handle.net/1721.1/91858
dc.description: Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2014. [en_US]
dc.description: Cataloged from PDF version of thesis. [en_US]
dc.description: Includes bibliographical references (pages 53-55). [en_US]
dc.description.abstract: Communities in social interaction networks or graphs are sets of well-connected and very often overlapping vertices. Formally, we view any maximal clique of the social network graph as a community. The problem of finding maximal cliques is known to be computationally hard. The goal of this work is to identify structural conditions in social network graphs that lead to efficient identification of maximal cliques, i.e., overlapping communities. We propose an evolutionary model called sequential community graphs for community formation in social networks. In a sequential community graph, each node enters the graph by either joining an existing community or creating its own. To discover communities, i.e., maximal cliques, in such graphs, we present the non-parametric Iterative Leader-Follower Algorithm (ILFA). We establish that the ILFA finds all the communities/maximal cliques correctly in the sequential community graph model in time polynomial in the number of vertices in the graph. To scale to very large data sets, we propose a minor simplification of the ILFA, called the Fast Leader-Follower Algorithm (FLFA), which effectively runs in linear time in the input data size and finds all communities correctly for sequential community graphs with an additional constraint. Empirically, the FLFA and ILFA perform nearly the same in terms of accuracy, but the FLFA runs nearly three orders of magnitude faster. We find that the sequential community graph model is a good fit for a wide variety of social networks where users can be modeled as entering the graph by joining existing communities or creating their own. In such social networks, we demonstrate that the FLFA and ILFA outperform other state-of-the-art algorithms in terms of both speed and accuracy. For example, in the Internet Movie Database (IMDB) graph, where communities naturally correspond to actors in the same movie, our algorithms find nearly all ground truth communities correctly, while all other known community detection algorithms do very poorly. Similar empirical results are found for various other social data sets. This supports our hypothesis that we can model many social graphs as sequential community graphs and accurately detect their communities using the ILFA or FLFA. [en_US]
dc.description.statementofresponsibility: by Dhruuv Parthasarathy. [en_US]
dc.format.extent: 55 pages [en_US]
dc.language.iso: eng [en_US]
dc.publisher: Massachusetts Institute of Technology [en_US]
dc.rights: M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. [en_US]
dc.rights.uri: http://dspace.mit.edu/handle/1721.1/7582 [en_US]
dc.subject: Electrical Engineering and Computer Science. [en_US]
dc.title: Leaders, followers, and community detection [en_US]
dc.title.alternative: Fast, non-parametric detection of overlapping communities : the Leader-Follower algorithm [en_US]
dc.title.alternative: Leader-Follower algorithm [en_US]
dc.type: Thesis [en_US]
dc.description.degree: M. Eng. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.oclc: 894353059 [en_US]
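
Illustrative note on the abstract: the definition of a community as a maximal clique, and the way such communities can overlap, can be made concrete with a small sketch. The code below is not the thesis's ILFA or FLFA, which this record does not describe. It uses the basic Bron-Kerbosch enumeration as a stand-in, and the helper names (graph_from_communities, maximal_cliques) and the toy graph are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Set


def graph_from_communities(communities: List[Set[int]]) -> Dict[int, Set[int]]:
    """Build an adjacency map in which every listed community is a clique.

    Hypothetical helper: it only produces a toy input graph and is not the
    sequential community graph model from the thesis.
    """
    adj: Dict[int, Set[int]] = {}
    for members in communities:
        for u in members:
            # Connect u to every other member of the same community.
            adj.setdefault(u, set()).update(members - {u})
    return adj


def maximal_cliques(adj: Dict[int, Set[int]]) -> List[Set[int]]:
    """Enumerate all maximal cliques with basic Bron-Kerbosch (no pivoting)."""
    cliques: List[Set[int]] = []

    def expand(r: Set[int], p: Set[int], x: Set[int]) -> None:
        # r: current clique, p: candidates that extend r, x: already explored.
        if not p and not x:
            cliques.append(set(r))  # r cannot be extended, so it is maximal
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


# Two communities that overlap in vertex 2.
graph = graph_from_communities([{0, 1, 2}, {2, 3, 4}])
print(sorted(sorted(c) for c in maximal_cliques(graph)))
# prints [[0, 1, 2], [2, 3, 4]]
```

Vertex 2 belongs to both recovered cliques, which is the sense in which the abstract calls communities overlapping. Bron-Kerbosch can take exponential time on general graphs, consistent with the abstract's remark that finding maximal cliques is computationally hard.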

