Show simple item record

dc.contributor.author: Servan-Schreiber, Sacha
dc.contributor.author: Riondato, Matteo
dc.contributor.author: Zgraggen, Emanuel
dc.date.accessioned: 2021-09-20T17:30:18Z
dc.date.available: 2021-09-20T17:30:18Z
dc.date.issued: 2019-08-20
dc.identifier.uri: https://hdl.handle.net/1721.1/131797
dc.description.abstract: We present ProSecCo, an algorithm for the progressive mining of frequent sequences from large transactional datasets: It processes the dataset in blocks and it outputs, after having analyzed each block, a high-quality approximation of the collection of frequent sequences. ProSecCo can be used for interactive data exploration, as the intermediate results enable the user to make informed decisions as the computation proceeds. These intermediate results have strong probabilistic approximation guarantees and the final output is the exact collection of frequent sequences. Our correctness analysis uses the Vapnik–Chervonenkis (VC) dimension, a key concept from statistical learning theory. The results of our experimental evaluation of ProSecCo on real and artificial datasets show that it produces fast-converging high-quality results almost immediately. Its practical performance is even better than what is guaranteed by the theoretical analysis, and ProSecCo can even be faster than existing state-of-the-art non-progressive algorithms. Additionally, our experimental results show that ProSecCo uses a constant amount of memory, and orders of magnitude less than other standard, non-progressive, sequential pattern mining algorithms. (en_US)
dc.publisher: Springer London (en_US)
dc.relation.isversionof: https://doi.org/10.1007/s10115-019-01393-8 (en_US)
dc.rights: Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. (en_US)
dc.source: Springer London (en_US)
dc.title: ProSecCo: progressive sequence mining with convergence guarantees (en_US)
dc.type: Article (en_US)
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
dc.eprint.version: Author's final manuscript (en_US)
dc.type.uri: http://purl.org/eprint/type/JournalArticle (en_US)
eprint.status: http://purl.org/eprint/status/PeerReviewed (en_US)
dc.date.updated: 2020-09-24T20:42:09Z
dc.language.rfc3066: en
dc.rights.holder: Springer-Verlag London Ltd., part of Springer Nature
dspace.embargo.terms: Y
dspace.date.submission: 2020-09-24T20:42:09Z
mit.license: PUBLISHER_POLICY
mit.metadata.status: Authority Work and Publication Information Needed

