Show simple item record

dc.contributor.advisorMichel X. Goemans.en_US
dc.contributor.authorKim, Chiheonen_US
dc.contributor.otherMassachusetts Institute of Technology. Department of Mathematics.en_US
dc.date.accessioned2019-03-01T19:55:55Z
dc.date.available2019-03-01T19:55:55Z
dc.date.copyright2018en_US
dc.date.issued2018en_US
dc.identifier.urihttp://hdl.handle.net/1721.1/120659
dc.descriptionThesis: Ph. D., Massachusetts Institute of Technology, Department of Mathematics, 2018.en_US
dc.descriptionCataloged from PDF version of thesis.en_US
dc.descriptionIncludes bibliographical references (pages 205-213).en_US
dc.description.abstractCommunity recovery is a major challenge in data science and computer science. The goal in community recovery is to find the hidden clusters from given relational data, which is often represented as a labeled hyper graph where nodes correspond to items needing to be labeled and edges correspond to observed relations between the items. We investigate the problem of exact recovery in the class of statistical models which can be expressed in terms of graphical channels. In a graphical channel model, we observe noisy measurements of the relations between k nodes while the true labeling is unknown to us, and the goal is to recover the labels correctly. This generalizes both the stochastic block models and spiked tensor models for principal component analysis, which has gained much interest over the last decade. We focus on two aspects of exact recovery: statistical limits and efficient algorithms achieving the statistic limit. For the statistical limits, we show that the achievability of exact recovery is essentially determined by whether we can recover the label of one node given other nodes labels with fairly high probability. This phenomenon was observed by Abbe et al. for generic stochastic block models, and called "local-to-global amplification". We confirm that local-to-global amplification indeed holds for generic graphical channel models, under some regularity assumptions. As a corollary, the threshold for exact recovery is explicitly determined. For algorithmic concerns, we consider two examples of graphical channel models, (i) the spiked tensor model with additive Gaussian noise, and (ii) the generalization of the stochastic block model for k-uniform hypergraphs. We propose a strategy which we call "truncate-and-relax", based on a standard semidefinite relaxation technique. We show that in these two models, the algorithm based on this strategy achieves exact recovery up to a threshold which orderwise matches the statistical threshold. We complement this by showing the limitation of the algorithm.en_US
dc.description.statementofresponsibilityby Chiheon Kim.en_US
dc.format.extent213 pagesen_US
dc.language.isoengen_US
dc.publisherMassachusetts Institute of Technologyen_US
dc.rightsMIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission.en_US
dc.rights.urihttp://dspace.mit.edu/handle/1721.1/7582en_US
dc.subjectMathematics.en_US
dc.titleStatistical limits of graphical channel models and a semidefinite programming approachen_US
dc.typeThesisen_US
dc.description.degreePh. D.en_US
dc.contributor.departmentMassachusetts Institute of Technology. Department of Mathematics
dc.identifier.oclc1088419852en_US


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record