Neural-embedded discrete choice models
Massachusetts Institute of Technology. Department of Civil and Environmental Engineering.
P. Christopher Zegras, Francisco C. Pereira and Moshe E. Ben-Akiva.
This dissertation is motivated by the possible value of integrating theory-based discrete choice models (DCM) and data-driven neural networks. How to benefit from the strengths of both is the overarching question. I propose hybrid structures and strategies to flexibly represent taste heterogeneity, reduce potential biases, and improve predictability while keeping model interpretability. Also, I utilize neural networks' training machinery to speed up and scale up the estimation of Latent Class Choice Models (LCCMs).
First, I embed neural networks in DCMs to enable flexible representations of taste heterogeneity and enhance prediction accuracy. I propose two neural-embedded choice models: TasteNet-MNL and nonlinear-LCCM. Both models provide a flexible specification of taste as a function of individual characteristics.
TasteNet-MNL extends the Multinomial Logit Model (MNL). A feed-forward neural network (TasteNet) is utilized to predict taste parameters as a nonlinear function of individual characteristics. Taste parameters generated by TasteNet are further fed into a parametric logit model to formulate choice probabilities. I demonstrate the effectiveness of this integrated model in capturing nonlinearity in tastes without a priori knowledge. Using synthetic data, TasteNet-MNL is able to recover the underlying utility specification and predict more accurately than some misspecified MNLs and continuous mixed logit models. TasteNet-MNL also provides interpretations close to the ground truth. In an application to a public dataset (Swissmetro), TasteNet-MNL achieves the best out-of-sample prediction accuracy and discovers a broader spectrum of taste variation than the benchmark MNLs with linear utility specifications.
Nonlinear-LCCM enriches the class membership model of a typical LCCM. I represent an LCCM by a neural network and add hidden layers with nonlinear transformations to its class membership model. The nonlinearity introduced by the neural network provides a flexible approximation of the mixing distribution for both systematic and random taste heterogeneity. I apply this method to model Swissmetro mode choice. The nonlinear-LCCM outperforms an LCCM with a linear class membership model with respect to the out-of-sample prediction accuracy. Nonlinear-LCCM also provides interpretable taste parameters for each latent class.
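As an illustration of the two structures described in the abstract, the following is a minimal NumPy sketch of their forward passes only. It is not the thesis's implementation: the layer sizes, the ReLU and tanh activations, the random weights, and all function and variable names (`tastenet_mnl_probs`, `nonlinear_lccm_probs`, `z`, `x`, and so on) are assumptions made for the example, and estimation (training) is omitted.

```python
import numpy as np

def softmax(v, axis=-1):
    # Numerically stable softmax along the given axis.
    v = v - v.max(axis=axis, keepdims=True)
    e = np.exp(v)
    return e / e.sum(axis=axis, keepdims=True)

def tastenet_mnl_probs(z, x, W1, b1, W2, b2):
    """TasteNet-MNL-style forward pass (hypothetical one-hidden-layer network).

    z: (N, D) individual characteristics
    x: (N, J, K) attributes of J alternatives
    Returns (N, J) choice probabilities and (N, K) taste parameters.
    """
    h = np.maximum(0.0, z @ W1 + b1)              # hidden layer (ReLU is an assumption)
    beta = h @ W2 + b2                            # taste parameters, one vector per individual
    utilities = np.einsum("njk,nk->nj", x, beta)  # utility linear in attributes
    return softmax(utilities, axis=1), beta       # logit choice probabilities

def nonlinear_lccm_probs(z, x, V1, c1, V2, c2, class_betas):
    """Nonlinear-LCCM-style forward pass (hypothetical one-hidden-layer membership model).

    class_betas: (C, K) taste parameters, one vector per latent class
    Returns (N, J) choice probabilities mixed over C latent classes.
    """
    h = np.tanh(z @ V1 + c1)                                # nonlinear hidden layer (tanh is an assumption)
    membership = softmax(h @ V2 + c2, axis=1)               # (N, C) class membership probabilities
    utilities = np.einsum("njk,ck->ncj", x, class_betas)    # (N, C, J) class-specific utilities
    class_probs = softmax(utilities, axis=2)                # class-specific logit probabilities
    return np.einsum("nc,ncj->nj", membership, class_probs) # mixture over classes

# Toy dimensions and random inputs, purely illustrative.
rng = np.random.default_rng(0)
N, D, H, J, K, C = 5, 3, 8, 3, 2, 2
z = rng.normal(size=(N, D))
x = rng.normal(size=(N, J, K))

W1, b1 = rng.normal(scale=0.1, size=(D, H)), np.zeros(H)
W2, b2 = rng.normal(scale=0.1, size=(H, K)), np.zeros(K)
probs, beta = tastenet_mnl_probs(z, x, W1, b1, W2, b2)

V1, c1 = rng.normal(scale=0.1, size=(D, H)), np.zeros(H)
V2, c2 = rng.normal(scale=0.1, size=(H, C)), np.zeros(C)
class_betas = rng.normal(size=(C, K))
mix = nonlinear_lccm_probs(z, x, V1, c1, V2, c2, class_betas)
```

In the first function, the interpretable quantity is `beta`: each row is one individual's taste vector, which can be inspected directly. In the second, `class_betas` plays that role per latent class, while the network only shapes who belongs to which class.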
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Civil and Environmental Engineering, 2019
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 131-139).
Department: Massachusetts Institute of Technology. Department of Civil and Environmental Engineering
Massachusetts Institute of Technology
Civil and Environmental Engineering.