Robust model selection and outlier detection in linear regressions

McCann, Lauren, Ph. D. Massachusetts Institute of Technology

dc.contributor.advisor	Roy E. Welsch.	en_US
dc.contributor.author	McCann, Lauren, Ph. D. Massachusetts Institute of Technology	en_US
dc.contributor.other	Massachusetts Institute of Technology. Operations Research Center.	en_US
dc.date.accessioned	2007-02-21T13:09:23Z
dc.date.available	2007-02-21T13:09:23Z
dc.date.copyright	2006	en_US
dc.date.issued	2006	en_US
dc.identifier.uri	http://hdl.handle.net/1721.1/36222
dc.description	Thesis (Ph. D.)--Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2006.	en_US
dc.description	Includes bibliographical references (p. 191-196).	en_US
dc.description.abstract	In this thesis, we study the problems of robust model selection and outlier detection in linear regression. The results of data analysis based on linear regressions are highly sensitive to model choice and the existence of outliers in the data. This thesis aims to help researchers to choose the correct model when their data could be contaminated with outliers, to detect possible outliers in their data, and to study the impact that such outliers have on their analysis. First, we discuss the problem of robust model selection. Many methods for performing model selection were designed with the standard error model ... and least squares estimation in mind. These methods often perform poorly on real world data, which can include outliers. Robust model selection methods aim to protect us from outliers and capture the model that represents the bulk of the data. We review the currently available model selection algorithms (both non-robust and robust) and present five new algorithms. Our algorithms aim to improve upon the currently available algorithms, both in terms of accuracy and computational feasibility. We demonstrate the improved accuracy of our algorithms via a simulation study and a study on a real world data set.	en_US
dc.description.abstract	Finally, we discuss the problem of outlier detection. In addition to model selection, outliers can adversely influence many other outcomes of regression-based data analysis. We describe a new outlier diagnostic tool, which we call diagnostic data traces. This tool can be used to detect outliers and study their influence on a variety of regression statistics. We demonstrate our tool on several data sets, which are considered benchmarks in the field of outlier detection.	en_US
dc.description.statementofresponsibility	by Lauren McCann.	en_US
dc.format.extent	196 p.	en_US
dc.language.iso	eng	en_US
dc.publisher	Massachusetts Institute of Technology	en_US
dc.rights	M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission.	en_US
dc.rights.uri	http://dspace.mit.edu/handle/1721.1/7582
dc.subject	Operations Research Center.	en_US
dc.title	Robust model selection and outlier detection in linear regressions	en_US
dc.type	Thesis	en_US
dc.description.degree	Ph.D.	en_US
dc.contributor.department	Massachusetts Institute of Technology. Operations Research Center
dc.contributor.department	Sloan School of Management
dc.identifier.oclc	76951197	en_US

Files in this item

Name: 76951197-MIT.pdf
Size: 8.180Mb
Format: PDF
Description: Full printable version

This item appears in the following Collection(s)

Doctoral Theses