Learning with structured decision constraints

Goh, Chong Yang

Author(s)

Goh, Chong Yang

Other Contributors

Massachusetts Institute of Technology. Operations Research Center.

Advisor

Patrick Jaillet.

Terms of use

MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. http://dspace.mit.edu/handle/1721.1/7582

Abstract

This thesis addresses several prediction and estimation problems under structured decision constraints. We consider them in two parts below.

Part 1 focuses on supervised learning problems with constrained output spaces. We approach these problems in two ways. First, we consider an algorithmic framework that is based on minimizing estimated conditional risk functions. With this approach, we first estimate the conditional expected loss (i.e., conditional risk) function by regression, and then minimize it to predict an output. We analyze statistical and computational properties of this approach, and demonstrate empirically that it can adapt better to certain loss functions compared to methods that directly minimize surrogates of empirical risks. Second, we consider a constraint-embedding approach for reducing prediction time. The idea is to express the output constraints in terms of the model parameters, so that computational burdens are shifted from prediction to training. Specifically, we demonstrate how certain logical constraints in multilabel classification, such as implication, transitivity and mutual exclusivity, can be embedded in convex cones under a class of linear structured prediction models. The approach is also applicable to general affine constraints in vector regression tasks.

Part 2 concerns the estimation of a rank-based choice model under substitution constraints. Our motivating application is to estimate the primary demand for a bike-share service using censored data of commuters' trips. We model commuter arrivals with a Poisson process and characterize their trip preferences with a probability mass function (PMF) over rankings of origin-destination pairs. Estimating the arrival rate and PMF, however, is challenging due to the factorial growth of the number of rankings. To address this, we reduce the parameter dimension by (i) finding sparse representations efficiently, and (ii) constraining trip substitutions spatially according to the bike-share network. We also derive an iterative estimation procedure based on difference-of-convex programming. Our method is effective in recovering the primary demand and computationally tractable on a city scale, as we demonstrate on a bike-share service in Boston.
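
As a rough illustration of the two-step idea described for Part 1 (estimate the conditional risk by regression, then minimize it over the feasible outputs), the following Python sketch may help. It is not taken from the thesis: the feasible set `FEASIBLE` (two binary labels where the first implies the second), the Hamming loss, the per-output linear least-squares regressors, and all function names are assumptions chosen only to keep the example small.

```python
import numpy as np

# Hypothetical feasible output set: two binary labels, label 0 implies label 1,
# so the output (1, 0) is excluded.
FEASIBLE = np.array([[0, 0], [0, 1], [1, 1]])

def hamming_loss(y_pred, y_true):
    """Number of label positions where the two vectors disagree."""
    return float(np.sum(y_pred != y_true))

def add_intercept(X):
    """Prepend a column of ones to the feature matrix."""
    return np.hstack([np.ones((X.shape[0], 1)), X])

def fit_conditional_risk(X, Y):
    """Step 1: regress the loss of each feasible output on the features.

    Assumption: one linear least-squares model per feasible output.
    Returns a weight matrix of shape (n_features + 1, n_feasible_outputs).
    """
    A = add_intercept(X)
    # Regression targets: the loss each candidate would incur on each example.
    L = np.array([[hamming_loss(c, y) for c in FEASIBLE] for y in Y])
    W, *_ = np.linalg.lstsq(A, L, rcond=None)
    return W

def predict(W, X):
    """Step 2: pick the feasible output with the smallest estimated risk."""
    risk = add_intercept(X) @ W
    return FEASIBLE[np.argmin(risk, axis=1)]

# Tiny synthetic data set (made up for illustration).
X_train = np.array([[-2.0], [-1.0], [1.0], [2.0]])
Y_train = np.array([[0, 0], [0, 0], [1, 1], [1, 1]])
W = fit_conditional_risk(X_train, Y_train)
predictions = predict(W, np.array([[-3.0], [3.0]]))
```

Because the minimization in `predict` only ranges over `FEASIBLE`, every prediction satisfies the output constraint by construction. In this sketch the loss function enters only through the regression targets, so a different loss could be substituted without changing the prediction step.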

Description

Thesis: Ph. D., Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2018.

Cataloged from PDF version of thesis.

Includes bibliographical references (pages 119-125).

Date issued

2018

URI

http://hdl.handle.net/1721.1/119350

Department

Massachusetts Institute of Technology. Operations Research Center; Sloan School of Management

Publisher

Massachusetts Institute of Technology

Keywords

Operations Research Center.

Collections

Doctoral Theses