The Big Data Newsvendor: Practical Insights from Machine Learning Analysis
Author(s)
Rudin, Cynthia; Vahn, Gah-Yi
Abstract
We present a version of the newsvendor problem where one has n observations of p features as well as past demand. We consider both "big data" (p/n = O(1)) as well as small data (p/n = o(1)). For small data, we provide a linear programming machine learning algorithm that yields an asymptotically optimal order quantity. We also derive a generalization bound based on algorithmic stability, which is an upper bound on the expected out-of-sample cost. For big data, we propose a regularized version of the algorithm to address the curse of dimensionality. A generalization bound is derived for this case as well, bounding the out-of-sample cost with a quantity that depends on n and the amount of regularization. We apply the algorithm to analyze the newsvendor cost of nurse staffing using data from the emergency room of a large teaching hospital and show that (i) incorporating appropriate features can reduce the out-of-sample cost by up to 23% relative to the featureless Sample Average Approximation approach, and (ii) regularization can automate feature selection while controlling the out-of-sample cost. By an appropriate choice of the newsvendor underage and overage costs, our results also apply to quantile regression.
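To make the linear programming formulation in the abstract concrete, here is a minimal Python sketch of a feature-based newsvendor fit. It is an illustration under stated assumptions, not the authors' implementation. It assumes the order quantity is a linear function of the features, which the abstract does not state explicitly, and it omits the regularized variant. The function names, the use of SciPy's `linprog`, and the synthetic data are all choices made for this example. Here `b` is the per-unit underage cost and `h` the per-unit overage cost.

```python
import numpy as np
from scipy.optimize import linprog


def newsvendor_cost(order, demand, b, h):
    """Average newsvendor cost: b per unit short, h per unit left over."""
    order = np.asarray(order, dtype=float)
    demand = np.asarray(demand, dtype=float)
    short = np.maximum(demand - order, 0.0)
    excess = np.maximum(order - demand, 0.0)
    return float(np.mean(b * short + h * excess))


def fit_linear_newsvendor(X, demand, b, h):
    """Fit coefficients q so that the order X @ q minimizes in-sample cost.

    Assumption for this sketch: a linear decision rule, order_i = x_i' q.
    """
    X = np.asarray(X, dtype=float)
    demand = np.asarray(demand, dtype=float)
    n, p = X.shape
    # Decision vector: [q (p, free), u (n, underage >= 0), o (n, overage >= 0)]
    c = np.concatenate([np.zeros(p), np.full(n, b / n), np.full(n, h / n)])
    eye = np.eye(n)
    zero = np.zeros((n, n))
    # u_i >= d_i - x_i'q   <=>   -x_i'q - u_i <= -d_i
    # o_i >= x_i'q - d_i   <=>    x_i'q - o_i <=  d_i
    A_ub = np.block([[-X, -eye, zero], [X, zero, -eye]])
    b_ub = np.concatenate([-demand, demand])
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:p]


# Synthetic data (hypothetical): demand depends on one feature plus noise.
rng = np.random.default_rng(0)
n = 200
feature = rng.uniform(0.0, 1.0, n)
demand = 10.0 + 5.0 * feature + rng.normal(0.0, 1.0, n)

X_feat = np.column_stack([np.ones(n), feature])  # intercept + feature
X_saa = np.ones((n, 1))                          # intercept only (featureless)

q_feat = fit_linear_newsvendor(X_feat, demand, b=3.0, h=1.0)
q_saa = fit_linear_newsvendor(X_saa, demand, b=3.0, h=1.0)

cost_feat = newsvendor_cost(X_feat @ q_feat, demand, 3.0, 1.0)
cost_saa = newsvendor_cost(X_saa @ q_saa, demand, 3.0, 1.0)
print("in-sample cost with feature:", cost_feat)
print("in-sample cost, featureless:", cost_saa)
```

With only an intercept column, the fit reduces to the featureless Sample Average Approximation, whose solution is a b/(b+h) sample quantile of past demand. This is also the sense in which the same program performs quantile regression. The costs printed here are in-sample, whereas the paper's claims concern out-of-sample cost, so a held-out split would be needed to compare the two approaches fairly.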
Description
A 2/6/2014 revision to this paper is available at http://hdl.handle.net/1721.1/85658.
Date issued
2013-10-16
Series/Report no.
MIT Sloan School of Management Working Paper;5036-13
Keywords
big data, newsvendor, machine learning, Sample Average Approximation, statistical learning theory