DSpace@MIT (MIT Libraries)


Integrating Gradient Boosting and Generative Models: Hybrid Approach to Address Class Imbalance and Evaluation Gaps in Real-World Systems

Author(s)
Lau, Mary
Advisor
Gupta, Amar
Terms of use
In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
Anomaly detection remains a persistent challenge in machine learning due to extreme class imbalance, the high cost of false negatives, and the need to regulate false positives in real-world settings at scale. This thesis introduces Tail-end FPR Max Recall, a business-aware evaluation framework designed for such constrained environments. Using this framework, we benchmark LightGBM, a gradient boosting method known for its computational efficiency and predictive accuracy, on an imbalanced dataset, comparing its performance against standard academic evaluation criteria. Our results demonstrate that Tail-end FPR Max Recall fills critical gaps left by standard academic criteria, providing a more realistic assessment of model performance that aims to maximize recall while enforcing a false positive rate budget. Beyond benchmarking, we propose two strategies that incorporate deep learning methods to augment the already strong performance of gradient boosting: (1) using generative models to produce synthetic minority-class samples that outperform traditional oversampling techniques, and (2) using neural embeddings to improve feature representation for anomaly detection. Together, these contributions offer a methodology for evaluating and improving anomaly detection pipelines in domains where rare, high-impact events must be detected while meeting strict operational demands.
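The abstract does not give a formal definition of Tail-end FPR Max Recall, so the following is only a minimal sketch of one plausible reading: among all score thresholds whose false positive rate stays within a fixed budget, report the highest achievable recall. The function name `max_recall_at_fpr`, the parameter `fpr_budget`, and the toy labels and scores are hypothetical and are not taken from the thesis.

```python
import numpy as np


def max_recall_at_fpr(y_true, scores, fpr_budget):
    """Highest recall achievable with FPR <= fpr_budget (illustrative only).

    A sample is predicted positive when its score is >= the threshold.
    Returns (recall, threshold). If no threshold fits the budget, returns
    (0.0, inf), which corresponds to predicting nothing as positive.
    """
    y_true = np.asarray(y_true).astype(bool)
    scores = np.asarray(scores, dtype=float)
    n_pos = int(y_true.sum())
    n_neg = int((~y_true).sum())
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative sample")

    best_recall, best_threshold = 0.0, float("inf")
    # Walk thresholds from strictest to loosest. Both FPR and recall are
    # non-decreasing as the threshold drops, so the last threshold that
    # still fits the budget gives the maximum recall.
    for t in np.unique(scores)[::-1]:
        pred = scores >= t
        fpr = (pred & ~y_true).sum() / n_neg
        if fpr > fpr_budget:
            break
        best_recall = float((pred & y_true).sum() / n_pos)
        best_threshold = float(t)
    return best_recall, best_threshold


# Toy data (hypothetical): 3 anomalies among 10 samples.
labels = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
model_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
recall, threshold = max_recall_at_fpr(labels, model_scores, fpr_budget=0.2)
print(f"recall={recall:.3f} at threshold={threshold}")
```

Unlike a threshold-free summary such as ROC AUC, a metric of this shape scores a model only in the low-FPR region that an operational false positive budget would allow, which is presumably the gap the framework is meant to address.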
Date issued
2025-05
URI
https://hdl.handle.net/1721.1/162707
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses
