Show simple item record

dc.contributor.author: Baykal, Cenk
dc.contributor.author: Rus, Daniela L
dc.date.accessioned: 2021-04-13T11:38:34Z
dc.date.available: 2021-04-13T11:38:34Z
dc.date.issued: 2020-10
dc.identifier.issn: 0302-9743
dc.identifier.uri: https://hdl.handle.net/1721.1/130461
dc.description.abstract: We present an efficient coreset construction algorithm for large-scale Support Vector Machine (SVM) training in Big Data and streaming applications. A coreset is a small, representative subset of the original data points such that models trained on the coreset are provably competitive with those trained on the original data set. Since the coreset is generally much smaller than the original set, our preprocess-then-train scheme has the potential to yield significant speedups in SVM training. We prove lower and upper bounds on the coreset size required to obtain small data summaries for the SVM problem. As a corollary, we show that our algorithm can be used to extend the applicability of any off-the-shelf SVM solver to streaming, distributed, and dynamic data settings. We evaluate the performance of our algorithm on real-world and synthetic data sets. Our experimental results reaffirm the favorable theoretical properties of our algorithm and demonstrate its practical effectiveness in accelerating SVM training.
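The preprocess-then-train scheme described in the abstract can be sketched as follows. This is a minimal illustration only: the sampling probabilities below are uniform placeholders (the paper's construction samples by per-point sensitivities, which is what yields its provable guarantees), and the toy data, coreset size, and subgradient solver are all assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a large training set: 2,000 linearly separable points.
n = 2000
X = rng.normal(size=(n, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1.0, -1.0)

# Preprocess: draw a small weighted coreset by importance sampling.
# NOTE: uniform probabilities are a placeholder, not the paper's
# sensitivity-based distribution.
p = np.full(n, 1.0 / n)
m = 100                                    # coreset size, much smaller than n
idx = rng.choice(n, size=m, p=p)
Xc, yc = X[idx], y[idx]
w = 1.0 / (m * p[idx])                     # weights make coreset sums unbiased

# Train: any off-the-shelf solver can consume (Xc, yc, w); here, a
# minimal Pegasos-style subgradient method for the weighted hinge loss.
lam, theta = 0.01, np.zeros(2)
for t in range(1, 501):
    viol = yc * (Xc @ theta) < 1           # margin violators on the coreset
    g = lam * theta - ((w * yc)[viol, None] * Xc[viol]).sum(0) / w.sum()
    theta -= g / (lam * t)                 # standard 1/(lam*t) step size

acc = np.mean(np.sign(X @ theta) == y)     # evaluate on the FULL data set
```

Training touches only the m weighted points, yet the resulting model is evaluated against all n points; this is the speedup mechanism the abstract refers to, with the paper's sensitivity bounds controlling how small m can be while keeping the coreset solution competitive.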
dc.description.sponsorship: National Science Foundation (U.S.) (Awards 1723943 and 1526815)
dc.description.sponsorship: United States. Office of Naval Research (Grant N00014-18-1-2830)
dc.language.iso: en
dc.publisher: Springer International Publishing
dc.relation.isversionof: 10.1007/978-3-030-59267-7_25
dc.rights: Creative Commons Attribution-Noncommercial-Share Alike
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/
dc.source: arXiv
dc.title: On Coresets for Support Vector Machines
dc.type: Article
dc.identifier.citation: Tukan, Murad et al. “On Coresets for Support Vector Machines.” Lecture Notes in Computer Science, 12337 LNCS, International Conference on Theory and Applications of Models of Computation (TAMC 2020), Changsha, China, October 18-20, 2020, Springer International Publishing: 287-299 © 2020 The Author(s)
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
dc.relation.journal: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
dc.eprint.version: Original manuscript
dc.type.uri: http://purl.org/eprint/type/ConferencePaper
eprint.status: http://purl.org/eprint/status/NonPeerReviewed
dc.date.updated: 2021-04-05T14:55:13Z
dspace.orderedauthors: Tukan, M; Baykal, C; Feldman, D; Rus, D
dspace.date.submission: 2021-04-05T14:55:14Z
mit.journal.volume: 12337 LNCS
mit.license: OPEN_ACCESS_POLICY
mit.metadata.status: Authority Work and Publication Information Needed

