Efficient Robustness and Interpretability in Learning and Data-Driven Decision-Making
Author(s)
Bennouna, Mohammed Amine
Advisor
Van Parys, Bart P. G.
Abstract
As machine learning algorithms are increasingly developed and deployed in high-stakes applications, ensuring their reliability has become crucial. This thesis introduces algorithmic advancements toward reliability in machine learning, emphasizing two critical dimensions: Robustness and Interpretability.
The first part of this thesis focuses on robustness, which guarantees that algorithms deliver stable and predictable performance despite various data uncertainties. We study robustness when learning under diverse sources of data uncertainty, including the fundamental statistical error as well as data noise and corruption. Our work reveals how these different sources interact and subsequently impact data-driven decisions. We introduce novel distributionally robust optimization approaches, each tailored to a specific uncertainty source. Our findings highlight that protection against one source may increase vulnerability to another. To address this, we develop distributional ambiguity sets that provide holistic robustness against all sources simultaneously. In each setting, we demonstrate that our approaches achieve "efficient" robustness, optimally balancing average performance with out-of-sample guarantees. Our algorithms are applied to various scenarios, including training robust neural networks, where they significantly outperform existing benchmarks.
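As a point of reference, a generic distributionally robust learning problem (standard notation, not the specific ambiguity sets constructed in the thesis) minimizes the worst-case expected loss over distributions near the empirical distribution:

\min_{\theta} \; \sup_{Q \in \mathcal{U}_{\varepsilon}(\hat{P}_n)} \mathbb{E}_{Q}\big[\ell(\theta;\xi)\big],

where \hat{P}_n is the empirical distribution of the observed samples, \mathcal{U}_{\varepsilon}(\hat{P}_n) is an ambiguity set of plausible distributions (for instance, a divergence ball of radius \varepsilon), \ell is the loss, and \theta are the model parameters. Different choices of the ambiguity set correspond to protection against different sources of uncertainty.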
The second part of the thesis addresses interpretability, a critical attribute for decision-support tools in high-risk settings, which requires that algorithms provide understandable justifications for their decisions. Our work in this part is motivated by data-driven personalized patient treatment, an increasingly sought-after machine learning application. In this reinforcement learning problem, interpretability is crucial: physicians cannot rely on a black-box algorithm to prescribe treatments. We formally introduce the problem of learning the most concise discrete representation of a continuous state-space dynamic system. In the patient treatment setting, this corresponds to identifying treatment groups based on the evolving features of patients under treatment. Surprisingly, we prove that it is statistically possible to learn the most concise representation of a dynamic system solely from observed historical sample-path data. We subsequently develop an algorithm, MRL, that learns such a concise representation, thereby enhancing both interpretability and tractability.
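Illustratively (our notation here, not the thesis's exact definition), a concise discrete representation can be viewed as a state-aggregation map

\phi : \mathcal{S} \to \{1, \dots, K\}

with K as small as possible, under which states assigned to the same group are approximately equivalent for decision-making: \phi(s) = \phi(s') implies r(s,a) \approx r(s',a) and P\big(\phi(s_{t+1}) = k \mid s_t = s, a\big) \approx P\big(\phi(s_{t+1}) = k \mid s_t = s', a\big) for all actions a and groups k. In the patient treatment setting, each group would then correspond to a treatment group whose members share the same recommended action and transition behavior.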
Date issued
2024-05
Department
Massachusetts Institute of Technology. Operations Research Center; Sloan School of Management
Publisher
Massachusetts Institute of Technology