Improving Efficiency and Fairness in Machine Learning: a Discrete Optimization Approach

Author(s)

Bandi, Hari

Advisor

Bertsimas, Dimitris

Terms of use

In Copyright - Educational Use Permitted. Copyright MIT. http://rightsstatements.org/page/InC-EDU/1.0/

Abstract

In recent years, machine learning models have been increasingly deployed in applications spanning education, finance, healthcare, and transportation. However, in most practical situations one-size-fits-all solutions suffer from poor predictive performance, bias against certain subgroups, or both. This necessitates new approaches that enhance the robustness, interpretability, and fairness of the resulting machine learning systems. We borrow tools from discrete and robust optimization to develop models and algorithms for such systems. The first part of this thesis focuses on developing novel methodologies to enhance the performance of specific predictive models. In particular, in the first chapter we propose a novel Mixed Integer Optimization (MIO) formulation that optimally recovers the parameters of a Gaussian mixture model (GMM) by minimizing a discrepancy measure (either the Kolmogorov-Smirnov or the total variation distance) between the empirical distribution function and the distribution function of the GMM whenever the mixture component weights are known. In the second chapter, we present a holistic framework employing tensor completion and robust optimization for prescribing influenza vaccine composition. We also build an optimal classification tree to predict the efficacy of the proposed vaccine in terms of morbidity and mortality rates for different countries. In the second part of the thesis, we present novel algorithms to alleviate systemic bias with respect to gender, race, and ethnicity, which is often unconscious but prevalent in datasets involving choices made by people.
We propose (a) a novel optimization approach that optimally flips outcome labels and trains classification models simultaneously, in order to discover changes to the selection process that achieve diversity without significantly affecting meritocracy, and (b) a novel implementation tool employing optimal classification trees to provide insights on which attributes of individuals lead to flipping of their labels, and to help make changes to current selection processes in a manner understandable by human decision makers. In the final chapter, we apply our work to a discharge disposition prediction problem for trauma patients: we debias the dataset with respect to race and train optimal classification trees to predict discharge decisions for trauma patients with penetrating injuries. Our impact here is twofold: (1) alleviating bias to enhance diversity in discharge decisions and developing an implementation tool using optimal classification trees to promote changes in the selection process, and (2) improving the predictive performance (AUC) of the resulting classifiers after debiasing the dataset.
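As a concrete illustration of the discrepancy measure named in the first chapter, the following minimal sketch evaluates the Kolmogorov-Smirnov distance between the empirical distribution function of a sample and the distribution function of a GMM with fixed parameters. It is not the thesis's MIO formulation, which optimizes over the parameters; the univariate setting and the function names `gmm_cdf` and `ks_distance` are assumptions made for illustration only.

```python
import math


def gmm_cdf(x, weights, means, stds):
    """CDF of a univariate Gaussian mixture at x (illustrative helper)."""
    return sum(
        w * 0.5 * (1.0 + math.erf((x - m) / (s * math.sqrt(2.0))))
        for w, m, s in zip(weights, means, stds)
    )


def ks_distance(samples, weights, means, stds):
    """Sup-norm gap between the empirical CDF and the mixture CDF."""
    xs = sorted(samples)
    n = len(xs)
    gap = 0.0
    for i, x in enumerate(xs):
        f = gmm_cdf(x, weights, means, stds)
        # The empirical CDF jumps from i/n to (i + 1)/n at x, so the
        # supremum is attained on one side of a jump.
        gap = max(gap, abs((i + 1) / n - f), abs(f - i / n))
    return gap


# Hypothetical data: known weights, candidate means and standard deviations.
sample = [-1.2, -0.9, -1.1, 0.8, 1.0, 1.3]
print(ks_distance(sample, [0.5, 0.5], [-1.0, 1.0], [0.3, 0.3]))
```

In the setting the abstract describes, the weights would be held fixed and known while the remaining parameters are chosen to minimize this quantity; the sketch only scores one candidate parameter set.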

Date issued

2021-09

URI

https://hdl.handle.net/1721.1/139985

Department

Massachusetts Institute of Technology. Operations Research Center

Publisher

Massachusetts Institute of Technology

Collections

Doctoral Theses