AMS: Generating AutoML search spaces from weak specifications
Author(s)
Cambronero, JP; Cito, J; Rinard, MC
DownloadPublished version (709.4Kb)
Publisher with Creative Commons License
Publisher with Creative Commons License
Creative Commons Attribution
Terms of use
Metadata
Show full item recordAbstract
© 2020 Owner/Author. We consider a usage model for automated machine learning (AutoML) in which users can influence the generated pipeline by providing a weak pipeline specification: an unordered set of API components from which the AutoML system draws the components it places into the generated pipeline. Such specifications allow users to express preferences over the components that appear in the pipeline, for example a desire for interpretable components to appear in the pipeline. We present AMS, an approach to automatically strengthen weak specifications to include unspecified complementary and functionally related API components, populate the space of hyperparameters and their values, and pair this configuration with a search procedure to produce a strong pipeline specification: a full description of the search space for candidate pipelines. ams uses normalized pointwise mutual information on a code corpus to identify complementary components, BM25 as a lexical similarity score over the target API's documentation to identify functionally related components, and frequency distributions in the code corpus to extract key hyperparameters and values. We show that strengthened specifications can produce pipelines that outperform the pipelines generated from the initial weak specification and an expert-annotated variant, while producing pipelines that still reflect the user preferences captured in the original weak specification.
Date issued
2020Department
Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer ScienceJournal
ESEC/FSE 2020 - Proceedings of the 28th ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering
Publisher
ACM
Citation
Cambronero, JP, Cito, J and Rinard, MC. 2020. "AMS: Generating AutoML search spaces from weak specifications." ESEC/FSE 2020 - Proceedings of the 28th ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering.
Version: Final published version