Learning with structured decision constraints
Author(s): Goh, Chong Yang
Massachusetts Institute of Technology. Operations Research Center.
This thesis addresses several prediction and estimation problems under structured decision constraints. We consider them in two parts below.
Part 1 focuses on supervised learning problems with constrained output spaces. We approach them in two ways. First, we consider an algorithmic framework that is based on minimizing estimated conditional risk functions. With this approach, we first estimate the conditional expected loss (i.e., conditional risk) function by regression, and then minimize it to predict an output. We analyze statistical and computational properties of this approach, and demonstrate empirically that it can adapt better to certain loss functions compared to methods that directly minimize surrogates of empirical risks. Second, we consider a constraint-embedding approach for reducing prediction time. The idea is to express the output constraints in terms of the model parameters, so that computational burdens are shifted from prediction to training. Specifically, we demonstrate how certain logical constraints in multilabel classification, such as implication, transitivity and mutual exclusivity, can be embedded in convex cones under a class of linear structured prediction models. The approach is also applicable to general affine constraints in vector regression tasks.
Part 2 concerns the estimation of a rank-based choice model under substitution constraints. Our motivating application is to estimate the primary demand for a bike-share service using censored data of commuters' trips. We model commuter arrivals with a Poisson process and characterize their trip preferences with a probability mass function (PMF) over rankings of origin-destination pairs. Estimating the arrival rate and PMF, however, is challenging due to the factorial growth of the number of rankings. To address this, we reduce the parameter dimension by (i) finding sparse representations efficiently, and (ii) constraining trip substitutions spatially according to the bike-share network. We also derive an iterative estimation procedure based on difference-of-convex programming. Our method is effective in recovering the primary demand and computationally tractable on a city scale, as we demonstrate on a bike-share service in Boston.
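As a rough illustration of the first approach in Part 1 (estimate the conditional risk by regression, then minimize it over the feasible outputs), the sketch below uses k-nearest-neighbour averaging as the regression step. It is not taken from the thesis: the regressor, the toy data, the asymmetric loss, the feasible set, and the names `knn_conditional_risk` and `predict` are all assumptions made for illustration.

```python
from typing import Callable, Sequence, Tuple

Sample = Tuple[float, int]  # (feature x, observed output y)


def knn_conditional_risk(
    x: float,
    candidate: int,
    data: Sequence[Sample],
    loss: Callable[[int, int], float],
    k: int = 3,
) -> float:
    """Estimate E[loss(candidate, Y) | X = x] by averaging the loss
    over the k training samples whose features are closest to x."""
    neighbours = sorted(data, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(loss(candidate, y) for _, y in neighbours) / len(neighbours)


def predict(
    x: float,
    feasible: Sequence[int],
    data: Sequence[Sample],
    loss: Callable[[int, int], float],
    k: int = 3,
) -> int:
    """Return the feasible output with the smallest estimated conditional risk."""
    return min(feasible, key=lambda c: knn_conditional_risk(x, c, data, loss, k))


def asymmetric_loss(pred: int, true: int) -> float:
    # Hypothetical loss: under-prediction costs 3 per unit, over-prediction 1 per unit.
    return 3.0 * (true - pred) if pred < true else 1.0 * (pred - true)


# Toy training set and a constrained output space (only these outputs are allowed).
data = [(0.0, 0), (0.1, 1), (0.2, 0), (1.0, 2), (1.1, 3), (1.2, 2)]
feasible = [0, 2, 4]

print(predict(0.1, feasible, data, asymmetric_loss))  # 0
print(predict(1.1, feasible, data, asymmetric_loss))  # 2
```

Because the loss enters only through the regression targets, swapping `asymmetric_loss` for another loss changes the estimated risk, and hence the prediction rule, without changing the rest of the procedure.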
Thesis: Ph. D., Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2018. Cataloged from PDF version of thesis. Includes bibliographical references (pages 119-125).
Department: Massachusetts Institute of Technology. Operations Research Center.
Massachusetts Institute of Technology
Operations Research Center.