DSpace@MIT (MIT Libraries)


Sublinear-Time Algorithms for Counting Star Subgraphs via Edge Sampling

Author(s)
Aliakbarpour, Maryam; Biswas, Amartya Shankha; Gouleakis, Themistoklis; Peebles, John Lee Thompson; Yodpinyanee, Anak; Rubinfeld, Ronitt
Download: 453_2017_287_ReferencePDF.pdf (376.8Kb)
Open Access Policy

Creative Commons Attribution-Noncommercial-Share Alike

Terms of use
Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract
We study the problem of estimating the value of sums of the form $S_p \triangleq \sum \binom{x_i}{p}$ when one has the ability to sample $x_i \ge 0$ with probability proportional to its magnitude. When $p = 2$, this problem is equivalent to estimating the selectivity of a self-join query in database systems when one can sample rows randomly. We also study the special case when $\{x_i\}$ is the degree sequence of a graph, which corresponds to counting the number of $p$-stars in a graph when one has the ability to sample edges randomly. Our algorithm for a $(1 \pm \epsilon)$-multiplicative approximation of $S_p$ has query and time complexities $O\left(\frac{m \log\log n}{\epsilon^2 S_p^{1/p}}\right)$. Here, $m = \sum x_i / 2$ is the number of edges in the graph, or equivalently, half the number of records in the database table. Similarly, $n$ is the number of vertices in the graph and the number of unique values in the database table. We also provide tight lower bounds (up to polylogarithmic factors) in almost all cases, even when $\{x_i\}$ is a degree sequence and one is allowed to use the structure of the graph to try to get a better estimate. We are not aware of any prior lower bounds on the problem of join selectivity estimation. For the graph problem, prior work which assumed the ability to sample only vertices uniformly gave algorithms with matching lower bounds (Gonen et al. in SIAM J Comput 25:1365–1411, 2011). With the ability to sample edges randomly, we show that one can achieve faster algorithms for approximating the number of star subgraphs, bypassing the lower bounds in this prior work. For example, in the regime where $S_p \le n$ and $p = 2$, our upper bound is $\tilde{O}(n / S_p^{1/2})$, in contrast to their $\Omega(n / S_p^{1/3})$ lower bound when no random edge queries are available.
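To make the quantity in the abstract concrete, the following Python sketch computes the sum of binomial coefficients over a degree sequence exactly, checks it against a brute-force count of p-stars, and then runs a deliberately naive unbiased estimator that uses only random edge samples. This is an illustration only and is not the algorithm from the paper: it makes no attempt to reach the stated query complexity, and it assumes the degree of a sampled vertex can be looked up. All function names are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations
from math import comb


def degrees(edges):
    """Degree of every vertex of an undirected simple graph given as edge pairs."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def exact_star_count(edges, p):
    """S_p as the sum over vertices of C(deg(v), p)."""
    return sum(comb(d, p) for d in degrees(edges).values())


def brute_force_star_count(edges, p):
    """Count p-stars directly: one center plus a set of p distinct neighbors."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    return sum(1 for ns in nbrs.values() for _ in combinations(sorted(ns), p))


def naive_estimate(edges, p, samples, rng):
    """Naive unbiased estimate of S_p from uniform edge samples.

    Assumption (not from the paper): after sampling an edge we may query the
    degree of one of its endpoints. A uniform edge followed by a uniform
    endpoint yields vertex v with probability deg(v) / (2m), so the value
    C(deg(v), p) * 2m / deg(v) has expectation exactly S_p.
    """
    deg = degrees(edges)
    m = len(edges)
    total = 0.0
    for _ in range(samples):
        v = rng.choice(rng.choice(edges))
        total += comb(deg[v], p) * 2 * m / deg[v]
    return total / samples


if __name__ == "__main__":
    # Center 0 with leaves 1..4, plus one extra edge between leaves 1 and 2.
    example = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2)]
    print(exact_star_count(example, 2))        # 8
    print(brute_force_star_count(example, 2))  # 8
    print(naive_estimate(example, 2, 20000, random.Random(0)))
```

For p = 2 the same sum also gives the self-join size of a column whose value counts are the x_i, since the sum of x_i squared equals 2 S_2 plus the number of rows.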
In addition, we consider the problem of counting the number of directed paths of length two when the graph is directed. This problem is equivalent to estimating the selectivity of a join query between two distinct tables. We prove that the general version of this problem cannot be solved in sublinear time. However, when the ratio between in-degree and out-degree is bounded (or equivalently, when the ratio between the number of occurrences of values in the two columns being joined is bounded), we give a sublinear-time algorithm via a reduction to the undirected case.
Keywords: Subgraphs, Approximate counting, Randomized algorithms, Sublinear-time algorithms
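The correspondence between directed two-paths and join size mentioned in the abstract can be illustrated with a short exact computation. The Python sketch below is an assumption-laden illustration, not the paper's sublinear-time method: it reads every edge, counts pairs of an incoming and an outgoing edge at each middle vertex (so a path that returns to its start vertex is included), and compares the result with the size of an equi-join between the column of edge heads and the column of edge tails. The function names are hypothetical.

```python
from collections import Counter


def two_path_count(edges):
    """Number of edge pairs (u -> v, v -> w) in a directed graph.

    Each middle vertex v contributes in_degree(v) * out_degree(v).
    """
    in_deg = Counter(v for _, v in edges)
    out_deg = Counter(u for u, _ in edges)
    return sum(in_deg[v] * out_deg[v] for v in in_deg)


def join_size(left_column, right_column):
    """Row count of an equi-join between two columns of values."""
    left = Counter(left_column)
    right = Counter(right_column)
    return sum(count * right[value] for value, count in left.items())


if __name__ == "__main__":
    example = [(1, 2), (3, 2), (2, 4), (2, 5), (4, 5)]
    heads = [v for _, v in example]  # values in the first table's join column
    tails = [u for u, _ in example]  # values in the second table's join column
    print(two_path_count(example))   # 5
    print(join_size(heads, tails))   # 5
```

Here the in-degree and out-degree of a vertex play the role of the number of occurrences of a value in each of the two joined columns, which is the ratio the bounded case in the abstract refers to.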
Date issued
2017-02
URI
http://hdl.handle.net/1721.1/115241
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Journal
Algorithmica
Publisher
Springer US
Citation
Aliakbarpour, Maryam, et al. “Sublinear-Time Algorithms for Counting Star Subgraphs via Edge Sampling.” Algorithmica, vol. 80, no. 2, Feb. 2018, pp. 668–97.
Version: Author's final manuscript
ISSN
0178-4617
1432-0541

Collections
  • MIT Open Access Articles
