DSpace@MIT

Theory-constrained Data-driven Model Selection, Specification, and Estimation: Applications in Discrete Choice Models

Author(s)
Aboutaleb, Youssef Medhat
Advisor
Ben-Akiva, Moshe
Terms of use
In Copyright - Educational Use Permitted. Copyright MIT. http://rightsstatements.org/page/InC-EDU/1.0/
Abstract
This thesis provides a framework, with demonstrated applications, for carefully bringing data-driven flexibility to the specification and model selection of discrete choice models while maintaining usability for analysis. Assumptions brought to bear under the classical theory-based paradigm enjoy varying degrees of credibility. Some are rooted in economic theory (e.g., utility-maximizing behavior) or in information available to the scientist on the data-generating process (e.g., exogeneity). These assumptions can be argued to be highly credible. Others are driven by convenience, convention, the pursuit of smaller standard errors, or the lack of a systematic specification and model selection process (e.g., restrictive functional and distributional forms, and trial-and-error specification testing). These assumptions are arguably less credible. Our goal is to overcome some of the arbitrary specification and model selection practices that undermine credibility. To this end, theory-constrained data-driven flexibility in specification is introduced to discrete choice models through an optimization framework. Systematic data-driven methods for model selection are used to enhance replicability. The introduced flexibility is constrained to guarantee trustworthiness of predictions through consistency with theory. At the same time, the imposed constraints are validated through hypothesis tests to maintain credibility. The framework we introduce positions us well to realize synergies between the data-driven and theory-based paradigms. The starting point for our approach is discrete choice models with well-established theoretical underpinnings that facilitate causal and behavioral interpretations. Discrete choice models consistent with random utility maximization, for example, are tethered to microeconomics and enable sound economic and welfare valuations.
Further, the entire machinery of econometrics remains applicable to address endogeneity issues. This is in contrast to emerging trends in the literature that start with data-driven classifiers in pursuit of predictive gains and then, as an afterthought, attempt to reconcile output with theory. We provide applications of our proposed framework in addressing specification aspects of both the systematic and stochastic components of discrete choice models. Specialized solution algorithms are developed for each application, leveraging some of the latest advances in mixed-integer and conic optimization (for classical estimation) and in Markov chain Monte Carlo methods (for Bayesian inference). The methods developed are tested for consistency using synthetic data and applied to empirical data.
Date issued
2022-02
URI
https://hdl.handle.net/1721.1/143299
Department
Massachusetts Institute of Technology. Department of Civil and Environmental Engineering
Publisher
Massachusetts Institute of Technology

Collections
  • Doctoral Theses