DSpace@MIT (MIT Libraries)

SigPro: Enabling Subject Matter Expert Guidance in Feature Engineering

Author(s)
Xu, Guanpeng Andy
Download: Thesis PDF (1.127Mb)
Advisor
Veeramachaneni, Kalyan
Terms of use
In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
In this thesis, we detail developments to SigPro, a feature engineering library in Python guided by Subject Matter Experts (SMEs). SigPro includes a suite of data processing building blocks, or primitives, as well as an algorithm to combine primitives to form feature engineering pipelines. These pipelines are in turn used to construct features for machine learning. SMEs, through a low-code interface, have several ways to dictate the feature engineering process. First, subject matter experts can construct a feature engineering pipeline for signal data simply by specifying a sequence of data transformations and aggregations (building blocks); SigPro then automatically composes a primitive graph and thus a feature engineering pipeline. Second, subject matter experts can also specify parameters and hyperparameters for each building block through SigPro’s user-friendly API. These methods encourage SMEs to incorporate their domain knowledge through informative feature transformations and carefully chosen parameter values. When existing building blocks fall short, SigPro facilitates efficient development of new primitives. To this end, we streamline the process for the contribution of new primitives while ensuring their seamless integration into existing pipelines. These improvements ensure that SigPro provides an intuitive yet effective solution where subject matter experts can leverage their domain knowledge to generate relevant, explanatory features that can greatly improve the performance of downstream predictive modeling.
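The idea of composing a feature engineering pipeline from a sequence of transformations and aggregations, with parameters set per building block, can be illustrated with a minimal sketch. This is not SigPro's API: every name below (`detrend`, `scale`, `rms`, `peak`, `build_pipeline`) is hypothetical and chosen only for illustration, and the sketch uses a plain function chain rather than the primitive graph the abstract describes.

```python
from functools import partial
from statistics import mean
from typing import Callable, Dict, List, Sequence

Signal = List[float]

# Hypothetical transformation: center the signal at zero.
def detrend(signal: Signal) -> Signal:
    m = mean(signal)
    return [x - m for x in signal]

# Hypothetical transformation with a user-chosen parameter.
def scale(signal: Signal, factor: float) -> Signal:
    return [x * factor for x in signal]

# Hypothetical aggregation: root mean square of the signal.
def rms(signal: Signal) -> float:
    return (sum(x * x for x in signal) / len(signal)) ** 0.5

# Hypothetical aggregation: largest value in the signal.
def peak(signal: Signal) -> float:
    return max(signal)

def build_pipeline(
    transformations: Sequence[Callable[[Signal], Signal]],
    aggregations: Dict[str, Callable[[Signal], float]],
) -> Callable[[Signal], Dict[str, float]]:
    """Chain the transformations in order, then apply every aggregation
    to the transformed signal to produce one named feature each."""
    def pipeline(signal: Signal) -> Dict[str, float]:
        for transform in transformations:
            signal = transform(signal)
        return {name: agg(signal) for name, agg in aggregations.items()}
    return pipeline

# The expert supplies only the sequence of steps and their parameters.
pipeline = build_pipeline(
    transformations=[detrend, partial(scale, factor=2.0)],
    aggregations={"rms": rms, "peak": peak},
)
features = pipeline([1.0, 3.0, 1.0, 3.0])
print(features)  # {'rms': 2.0, 'peak': 2.0}
```

In this sketch, domain knowledge enters in two places: the choice and order of building blocks, and the parameter values bound to them (here, `factor=2.0`).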
Date issued
2024-02
URI
https://hdl.handle.net/1721.1/153867
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses
