FlexGP
Author(s)
Veeramachaneni, Kalyan; Arnaldo, Ignacio; Derby, Owen; O’Reilly, Una-May
Publisher Policy
Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.
Abstract
We describe FlexGP, the first Genetic Programming system to perform symbolic regression on large-scale datasets on the cloud via massive data-parallel ensemble learning. FlexGP provides a decentralized, fault tolerant parallelization framework that runs many copies of Multiple Regression Genetic Programming, a sophisticated symbolic regression algorithm, on the cloud. Each copy executes with a different sample of the data and different parameters. The framework can create a fused model or ensemble on demand as the individual GP learners are evolving. We demonstrate our framework by deploying 100 independent GP instances in a massive data-parallel manner to learn from a dataset composed of 515K exemplars and 90 features, and by generating a competitive fused model in less than 10 minutes.
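The data-parallel idea in the abstract (many learners, each trained on a different data sample with different parameters, and fused on demand) can be illustrated with a minimal sketch. This is not FlexGP's code: the learner below is a ridge-penalised line fit standing in for Multiple Regression Genetic Programming, fusion by simple averaging is an assumption, the learners run sequentially rather than on cloud nodes, and all names (`fit_line`, `train_learners`, `fuse`) are hypothetical.

```python
import random
from statistics import mean


def fit_line(sample, ridge):
    """Stand-in learner (NOT genetic programming): ridge-penalised line fit."""
    xs = [x for x, _ in sample]
    ys = [y for _, y in sample]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in sample)
    slope = sxy / (sxx + ridge)          # ridge shrinks the slope slightly
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept


def train_learners(data, n_learners, sample_size, rng):
    """Train independent learners, each on its own sample and parameter."""
    learners = []
    for _ in range(n_learners):
        sample = rng.sample(data, sample_size)  # different data sample
        ridge = rng.uniform(0.0, 1.0)           # different parameter setting
        learners.append(fit_line(sample, ridge))
    return learners


def fuse(learners):
    """Assumed fusion rule: average the predictions of the current learners."""
    return lambda x: mean(f(x) for f in learners)


# Synthetic data for illustration only: y = 2x + 1 plus small Gaussian noise.
rng = random.Random(0)
data = [(i / 10, 2.0 * (i / 10) + 1.0 + rng.gauss(0.0, 0.1)) for i in range(200)]

learners = train_learners(data, n_learners=10, sample_size=50, rng=rng)
fused = fuse(learners)
print(f"fused prediction at x=5.0: {fused(5.0):.3f}")
```

Because `fuse` only reads the current list of learners, it can be called at any point while training continues, which mirrors the "on demand" fusion described above.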
Date issued
2014-11
Department
Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Laboratory for Information and Decision Systems
Journal
Journal of Grid Computing
Publisher
Springer Netherlands
Citation
Veeramachaneni, Kalyan et al. “FlexGP: Cloud-Based Ensemble Learning with Genetic Programming for Large Regression Problems.” Journal of Grid Computing 13.3 (2015): 391–407.
Version: Author's final manuscript
ISSN
1570-7873
1572-9184