Sloan Working Papers
http://hdl.handle.net/1721.1/1792
2014-09-18T05:46:08Z

The Big Data Newsvendor: Practical Insights from Machine Learning
http://hdl.handle.net/1721.1/85658
The Big Data Newsvendor: Practical Insights from Machine Learning
Rudin, Cynthia; Vahn, Gah-Yi
We investigate the newsvendor problem when one has n observations of p features related to the demand, as well as past demands. Both small data (p/n = o(1)) and big data (p/n = O(1)) are considered. For both cases, we propose a machine learning algorithm to solve the problem and derive a tight generalization bound on the expected out-of-sample cost. The algorithms extend intuitively to other situations, such as having censored demand data, ordering for multiple similar items, and having a new item with limited data. We show analytically that our custom-designed, feature-based approach can outperform other data-driven approaches such as Sample Average Approximation (SAA) and separated estimation and optimization (SEO). Our method can also naturally incorporate the operational statistics method. We then apply the algorithms to nurse staffing in a hospital emergency room and show that (i) they can reduce the median out-of-sample cost by up to 46% and 16% relative to SAA and SEO respectively, with statistical significance at the 0.01 level, and (ii) this is achieved either by carefully selecting a small number of features and applying the small-data algorithm, or by using a large number of features with the big-data algorithm, which automates feature selection.
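The feature-based approach described in the abstract can be sketched as empirical risk minimization: fit a linear ordering rule by minimizing the in-sample newsvendor cost, which is a linear program. The formulation and toy data below are an illustrative reconstruction under stated assumptions, not the authors' exact algorithm.

```python
import numpy as np
from scipy.optimize import linprog

def newsvendor_erm(X, d, b, h):
    """Fit a linear order rule q(x) = beta . [1, x] by minimizing the
    empirical newsvendor cost (underage cost b, overage cost h).
    LP variables: z = [beta (p+1), u (n underage slacks), o (n overage slacks)].
    Illustrative sketch, not the paper's exact formulation."""
    n, p = X.shape
    X1 = np.hstack([np.ones((n, 1)), X])  # add intercept column
    k = p + 1
    # objective: (1/n) * sum(b*u_i + h*o_i); beta enters only via constraints
    c = np.concatenate([np.zeros(k), np.full(n, b / n), np.full(n, h / n)])
    # u_i >= d_i - beta.x_i  <=>  -X1 beta - u <= -d
    A1 = np.hstack([-X1, -np.eye(n), np.zeros((n, n))])
    # o_i >= beta.x_i - d_i  <=>   X1 beta - o <=  d
    A2 = np.hstack([X1, np.zeros((n, n)), -np.eye(n)])
    res = linprog(c,
                  A_ub=np.vstack([A1, A2]),
                  b_ub=np.concatenate([-d, d]),
                  bounds=[(None, None)] * k + [(0, None)] * (2 * n),
                  method="highs")
    return res.x[:k]

# toy demand driven by one feature
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(60, 1))
d = 10 + 5 * X[:, 0] + rng.normal(0, 0.5, size=60)
beta = newsvendor_erm(X, d, b=9, h=1)  # b/(b+h) = 0.9 service target
```

With b = 9 and h = 1, the cost-minimizing rule orders near the conditional 0.9-quantile of demand, so the fitted rule covers roughly 90% of in-sample demand.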
This is a revision of previously published DSpace entry: http://hdl.handle.net/1721.1/81412.
2014-02-06T00:00:00Z

An Interpretable Stroke Prediction Model using Rules and Bayesian Analysis
http://hdl.handle.net/1721.1/82148
An Interpretable Stroke Prediction Model using Rules and Bayesian Analysis
Letham, Benjamin; Rudin, Cynthia; McCormick, Tyler H.; Madigan, David
We aim to produce predictive models that are not only accurate, but are also interpretable to human experts. Our models are decision lists, which consist of a series of if...then... statements (for example, if high blood pressure, then stroke) that discretize a high-dimensional, multivariate feature space into a series of simple, readily interpretable decision statements. We introduce a generative model called the Bayesian List Machine which yields a posterior distribution over possible decision lists. It employs a novel prior structure to encourage sparsity. Our experiments show that the Bayesian List Machine has predictive accuracy on par with the current top algorithms for prediction in machine learning. Our method is motivated by recent developments in personalized medicine, and can be used to produce highly accurate and interpretable medical scoring systems. We demonstrate this by producing an alternative to the CHADS2 score, actively used in clinical practice for estimating the risk of stroke in patients that have atrial fibrillation. Our model is as interpretable as CHADS2, but more accurate.
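The if...then structure of a decision list can be sketched as an ordered sequence of (condition, prediction) rules plus a default, where the first matching condition fires. The specific rules and risk numbers below are invented for illustration and are not the paper's fitted model.

```python
# A decision list: ordered (condition, risk) rules plus a default risk.
# Evaluation walks the list top to bottom; the first rule whose
# condition holds determines the prediction.

def predict(decision_list, default, patient):
    for condition, risk in decision_list:
        if condition(patient):
            return risk
    return default

# hypothetical stroke-risk rules (illustrative only)
stroke_rules = [
    (lambda p: p["age"] >= 75 and p["high_bp"], 0.8),
    (lambda p: p["high_bp"], 0.5),
    (lambda p: p["age"] >= 65, 0.3),
]

risk = predict(stroke_rules, 0.1, {"age": 70, "high_bp": False})  # -> 0.3
```

The Bayesian List Machine places a sparsity-encouraging prior over such lists and infers a posterior over them, rather than fixing one list by hand as above.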
2013-11-15T00:00:00Z

The Big Data Newsvendor: Practical Insights from Machine Learning Analysis
http://hdl.handle.net/1721.1/81412
The Big Data Newsvendor: Practical Insights from Machine Learning Analysis
Rudin, Cynthia; Vahn, Gah-Yi
We present a version of the newsvendor problem where one has n observations of p features as well as past demand. We consider both "big data" (p/n = O(1)) as well as small data (p/n = o(1)). For small data, we provide a linear programming machine learning algorithm that yields an asymptotically optimal order quantity. We also derive a generalization bound based on algorithmic stability, which is an upper bound on the expected out-of-sample cost. For big data, we propose a regularized version of the algorithm to address the curse of dimensionality. A generalization bound is derived for this case as well, bounding the out-of-sample cost with a quantity that depends on n and the amount of regularization. We apply the algorithm to analyze the newsvendor cost of nurse staffing using data from the emergency room of a large teaching hospital and show that (i) incorporating appropriate features can reduce the out-of-sample cost by up to 23% relative to the featureless Sample Average Approximation approach, and (ii) regularization can automate feature selection while controlling the out-of-sample cost. By an appropriate choice of the newsvendor underage and overage costs, our results also apply to quantile regression.
A 2/6/2014 revision to this paper is available at http://hdl.handle.net/1721.1/85658.
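The quantile-regression connection noted in the abstract can be checked numerically: with underage cost b and overage cost h, the per-period newsvendor cost equals (b + h) times the pinball (quantile) loss at quantile tau = b/(b + h). The function names below are illustrative, not from the paper.

```python
import numpy as np

def newsvendor_cost(q, d, b, h):
    """Cost of ordering q against demand d: b per unit short, h per unit over."""
    return b * np.maximum(d - q, 0) + h * np.maximum(q - d, 0)

def pinball_loss(q, d, tau):
    """Quantile-regression loss at quantile level tau."""
    return np.maximum(tau * (d - q), (tau - 1) * (d - q))

b, h = 9.0, 1.0
tau = b / (b + h)                 # 0.9
q, d = 12.0, np.array([10.0, 15.0])
# newsvendor cost equals (b + h) times the pinball loss at tau
assert np.allclose(newsvendor_cost(q, d, b, h), (b + h) * pinball_loss(q, d, tau))
```

So minimizing the empirical newsvendor cost over a family of decision rules is the same optimization as fitting a tau-quantile regression, which is why the paper's results transfer.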
2013-10-16T00:00:00Z

M.EOS: How General Management Matters
http://hdl.handle.net/1721.1/81262
M.EOS: How General Management Matters
Santos, Jose
This paper explores how general management matters and presents the foundation of a theory of general management as a theory of collective performance in the presence of the general manager. It proposes that general management is the disciplined art of creating a collective performance superior to that which would naturally occur. This brings the relation between management and performance to the fore: it tells us why management matters, not just what management is. General management matters because it creates value by improving collective performance relative to what such performance would be in the absence of general management. This paper goes beyond the dominant sequential model of company performance and refutes the notion that the general manager is a designer. It describes a conceptual model, M.EOS, which models company performance as the dynamic interaction between environment, organization, and strategy (EOS) that M, the general manager, improves by purposefully intervening to shift EOS. Unlike existing models, M.EOS is not a model of the company but a model of company performance. M.EOS can be used to understand what general managers do, or can do, in order to fulfill their most fundamental responsibility: augmenting collective performance.
2013-07-09T00:00:00Z