
dc.contributor.author: Renda, Alex
dc.contributor.author: Ding, Yi
dc.contributor.author: Carbin, Michael
dc.date.accessioned: 2023-11-17T18:44:13Z
dc.date.available: 2023-11-17T18:44:13Z
dc.date.issued: 2023-10-16
dc.identifier.issn: 2475-1421
dc.identifier.uri: https://hdl.handle.net/1721.1/153003
dc.description.abstract: Programmers and researchers are increasingly developing surrogates of programs, models of a subset of the observable behavior of a given program, to solve a variety of software development challenges. Programmers train surrogates from measurements of the behavior of a program on a dataset of input examples. A key challenge of surrogate construction is determining what training data to use to train a surrogate of a given program. We present a methodology for sampling datasets to train neural-network-based surrogates of programs. We first characterize the proportion of data to sample from each region of a program's input space (corresponding to different execution paths of the program) based on the complexity of learning a surrogate of the corresponding execution path. We next provide a program analysis to determine the complexity of different paths in a program. We evaluate these results on a range of real-world programs, demonstrating that complexity-guided sampling results in empirical improvements in accuracy. [en_US]
dc.publisher: ACM [en_US]
dc.relation.isversionof: https://doi.org/10.1145/3622856 [en_US]
dc.rights: Creative Commons Attribution [en_US]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [en_US]
dc.source: Association for Computing Machinery [en_US]
dc.title: Turaco: Complexity-Guided Data Sampling for Training Neural Surrogates of Programs [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Renda, Alex, Ding, Yi and Carbin, Michael. 2023. "Turaco: Complexity-Guided Data Sampling for Training Neural Surrogates of Programs." Proceedings of the ACM on Programming Languages, 7 (OOPSLA2).
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
dc.relation.journal: Proceedings of the ACM on Programming Languages [en_US]
dc.identifier.mitlicense: PUBLISHER_CC
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/JournalArticle [en_US]
eprint.status: http://purl.org/eprint/status/PeerReviewed [en_US]
dc.date.updated: 2023-11-01T07:57:43Z
dc.language.rfc3066: en
dc.rights.holder: The author(s)
dspace.date.submission: 2023-11-01T07:57:44Z
mit.journal.volume: 7 [en_US]
mit.journal.issue: OOPSLA2 [en_US]
mit.license: PUBLISHER_CC
mit.metadata.status: Authority Work and Publication Information Needed [en_US]



