SigPro: Enabling Subject Matter Expert Guidance in Feature Engineering
Author(s)
Xu, Guanpeng Andy
Advisor
Veeramachaneni, Kalyan
Abstract
In this thesis, we detail developments to SigPro, a feature engineering library in Python guided by Subject Matter Experts (SMEs). SigPro includes a suite of data processing building blocks, or primitives, as well as an algorithm to combine primitives to form feature engineering pipelines. These pipelines are in turn used to construct features for machine learning.
Through a low-code interface, SMEs have several ways to direct the feature engineering process. First, SMEs can construct a feature engineering pipeline for signal data simply by specifying a sequence of data transformations and aggregations (building blocks); SigPro then automatically composes a primitive graph and, from it, a feature engineering pipeline. Second, SMEs can specify parameters and hyperparameters for each building block through SigPro’s user-friendly API. These methods encourage SMEs to incorporate their domain knowledge through informative feature transformations and carefully chosen parameter values.
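The idea of specifying a sequence of transformations and aggregations, each with its own parameters, can be illustrated with a minimal sketch. This is not SigPro's actual API: the building-block names (`detrend`, `scale`, `rms`, `peak_to_peak`), the two registries, and `build_pipeline` are all hypothetical and exist only for this example. It also composes a simple linear chain, whereas the abstract describes a primitive graph.

```python
from statistics import mean

# Hypothetical building blocks (illustrative only; not SigPro primitives).
def detrend(values):
    """Transformation: subtract the mean from every sample."""
    center = mean(values)
    return [v - center for v in values]

def scale(values, factor=1.0):
    """Transformation: multiply every sample by a user-chosen factor."""
    return [v * factor for v in values]

def rms(values):
    """Aggregation: root mean square of the samples."""
    return (sum(v * v for v in values) / len(values)) ** 0.5

def peak_to_peak(values):
    """Aggregation: difference between the largest and smallest sample."""
    return max(values) - min(values)

# Assumed registries mapping block names to their implementations.
TRANSFORMATIONS = {"detrend": detrend, "scale": scale}
AGGREGATIONS = {"rms": rms, "peak_to_peak": peak_to_peak}

def build_pipeline(transformations, aggregations):
    """Compose a linear pipeline from block specifications.

    Transformations run in the given order; every aggregation is then
    computed on the transformed signal and returned as a named feature.
    """
    def pipeline(values):
        for step in transformations:
            func = TRANSFORMATIONS[step["name"]]
            values = func(values, **step.get("params", {}))
        return {
            step["name"]: AGGREGATIONS[step["name"]](values, **step.get("params", {}))
            for step in aggregations
        }
    return pipeline

# The expert only lists blocks and parameter values; composition is automatic.
pipeline = build_pipeline(
    transformations=[
        {"name": "detrend"},
        {"name": "scale", "params": {"factor": 2.0}},
    ],
    aggregations=[{"name": "rms"}, {"name": "peak_to_peak"}],
)
features = pipeline([1.0, 3.0, 5.0, 3.0])
```

In this sketch, `features` is a dictionary with one entry per aggregation, keyed by the aggregation name. Keeping the specification as plain data (names plus parameters) is one way to keep the interface low-code: domain knowledge enters through the choice of blocks and parameter values rather than through new code.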
When existing building blocks fall short, SigPro facilitates efficient development of new primitives. To this end, we streamline the process for the contribution of new primitives while ensuring their seamless integration into existing pipelines. These improvements ensure that SigPro provides an intuitive yet effective solution where subject matter experts can leverage their domain knowledge to generate relevant, explanatory features that can greatly improve the performance of downstream predictive modeling.
Date issued
2024-02
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology