dc.contributor.advisor | Cynthia Rudin. | en_US |
dc.contributor.author | Wang, Fulton. | en_US |
dc.contributor.other | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. | en_US |
dc.date.accessioned | 2019-11-12T17:40:25Z | |
dc.date.available | 2019-11-12T17:40:25Z | |
dc.date.copyright | 2018 | en_US |
dc.date.issued | 2018 | en_US |
dc.identifier.uri | https://hdl.handle.net/1721.1/122870 | |
dc.description | Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2018 | en_US |
dc.description | Cataloged from PDF version of thesis. | en_US |
dc.description | Includes bibliographical references (pages 71-77). | en_US |
dc.description.abstract | In this thesis, I develop solutions to two problems. In the first, I address the problem that many machine learning models are not interpretable by creating a new form of classifier, the Falling Rule List: a decision list classifier in which the predicted probabilities decrease down the list. Experiments show that the gain in interpretability need not come at a large cost in accuracy on real-world datasets. I then briefly discuss possible extensions that directly optimize rank statistics over rule lists and handle ordinal data. In the second, I address a shortcoming of a popular approach to covariate shift, the setting in which the training distribution and the distribution for which predictions must be made differ in their covariates. In particular, the standard importance weighting approach to handling covariate shift suffers from high variance when the two covariate distributions are very different. I develop a dimension reduction procedure that reduces this variance at the expense of increased bias. Experiments show that this tradeoff can be worthwhile in some situations. | en_US |
dc.description.statementofresponsibility | by Fulton Wang. | en_US |
dc.format.extent | 77 pages | en_US |
dc.language.iso | eng | en_US |
dc.publisher | Massachusetts Institute of Technology | en_US |
dc.rights | MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. | en_US |
dc.rights.uri | http://dspace.mit.edu/handle/1721.1/7582 | en_US |
dc.subject | Electrical Engineering and Computer Science. | en_US |
dc.title | Addressing two issues in machine learning : interpretability and dataset shift | en_US |
dc.type | Thesis | en_US |
dc.description.degree | Ph. D. | en_US |
dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | en_US |
dc.identifier.oclc | 1126649834 | en_US |
dc.description.collection | Ph.D. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science | en_US |
dspace.imported | 2019-11-12T17:40:24Z | en_US |
mit.thesis.degree | Doctoral | en_US |
mit.thesis.department | EECS | en_US |
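The abstract above names two techniques: falling rule lists and importance weighting under covariate shift. The sketch below is a minimal, hedged illustration of both in Python; the rule conditions, feature names ("age", "chest_pain"), probabilities, Gaussian distributions, and stand-in loss are invented for illustration and are not taken from the thesis. Only the general forms are standard: a decision list whose predicted probabilities decrease monotonically down the list, and importance weights w(x) = p_test(x) / p_train(x) whose variance grows as the two covariate distributions diverge.

```python
# Illustrative sketch only; all rules, numbers, and distributions are hypothetical.
import numpy as np

# --- Falling rule list (toy example) ---
# A falling rule list is a decision list whose attached probabilities decrease
# down the list. Prediction returns the probability of the first rule that fires.
falling_rule_list = [
    (lambda x: x["age"] >= 60 and x["chest_pain"], 0.85),  # hypothetical rule
    (lambda x: x["chest_pain"],                    0.45),  # hypothetical rule
    (lambda x: x["age"] >= 60,                     0.20),  # hypothetical rule
]
default_probability = 0.05  # used when no rule fires

def predict_proba(x):
    for condition, prob in falling_rule_list:
        if condition(x):
            return prob
    return default_probability

print(predict_proba({"age": 72, "chest_pain": False}))  # -> 0.20

# --- Importance weighting under covariate shift (toy example) ---
# Training examples are reweighted by w(x) = p_test(x) / p_train(x). When the
# two covariate distributions differ substantially, the weights become heavy-
# tailed and the weighted estimate has high variance -- the issue the thesis's
# dimension reduction procedure is aimed at.
rng = np.random.default_rng(0)
x_train = rng.normal(loc=0.0, scale=1.0, size=10_000)

p_train = lambda x: np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)          # N(0, 1)
p_test  = lambda x: np.exp(-0.5 * (x - 2.0)**2) / np.sqrt(2 * np.pi)  # N(2, 1), shifted

weights = p_test(x_train) / p_train(x_train)
loss = (x_train - 1.0) ** 2  # stand-in per-example loss

print("weighted loss estimate:", np.average(loss, weights=weights))
print("weight variance:", weights.var())  # large when the distributions differ
```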