The Univariate Flagging Algorithm (UFA): An interpretable approach for predictive modeling

Sheth, Mallory; Gerovitch, Albert S.; Welsch, Roy E.

dc.contributor.author	Sheth, Mallory
dc.contributor.author	Gerovitch, Albert S.
dc.contributor.author	Welsch, Roy E.
dc.date.accessioned	2020-03-31T18:58:56Z
dc.date.available	2020-03-31T18:58:56Z
dc.date.issued	2019-10-11
dc.identifier.issn	1932-6203
dc.identifier.uri	https://hdl.handle.net/1721.1/124462
dc.description.abstract	In many data classification problems, a number of methods will give similar accuracy. However, when working with people who are not experts in data science, such as doctors, lawyers, and judges, among others, finding interpretable algorithms can be a critical success factor. Practitioners have a deep understanding of the individual input variables but far less insight into how they interact with each other. For example, there may be ranges of an input variable for which the observed outcome is significantly more or less likely. This paper describes an algorithm for automatic detection of such thresholds, called the Univariate Flagging Algorithm (UFA). The algorithm searches for a separation that optimizes the difference between separated areas while obtaining a high level of support. We evaluate its performance using six sample datasets and demonstrate that thresholds identified by the algorithm align well with published results and known physiological boundaries. We also introduce two classification approaches that use UFA and show that the performance attained on unseen test data is comparable to or better than traditional classifiers when confidence intervals are considered. We identify conditions under which UFA performs well, including applications with large amounts of missing or noisy data, applications with a large number of inputs relative to observations, and applications where incidence of the target is low. We argue that ease of explanation of the results, robustness to missing data and noise, and detection of low-incidence adverse outcomes are desirable features for clinical applications that can be achieved with a relatively simple classifier such as UFA.	en_US
dc.language.iso	en
dc.publisher	Public Library of Science (PLoS)	en_US
dc.relation.isversionof	10.1371/journal.pone.0223161	en_US
dc.rights	Creative Commons Attribution 4.0 International license	en_US
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/	en_US
dc.source	PLoS	en_US
dc.subject	General Biochemistry, Genetics and Molecular Biology	en_US
dc.subject	General Agricultural and Biological Sciences	en_US
dc.subject	General Medicine	en_US
dc.title	The Univariate Flagging Algorithm (UFA): An interpretable approach for predictive modeling	en_US
dc.type	Article	en_US
dc.identifier.citation	Sheth, Mallory et al. "The Univariate Flagging Algorithm (UFA): An interpretable approach for predictive modeling." PloS one 14 (2019)	en_US
dc.contributor.department	Sloan School of Management	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science	en_US
dc.relation.journal	PloS one	en_US
dc.eprint.version	Final published version	en_US
dc.type.uri	http://purl.org/eprint/type/JournalArticle	en_US
eprint.status	http://purl.org/eprint/status/PeerReviewed	en_US
dc.date.updated	2020-02-11T13:06:01Z
dspace.date.submission	2020-02-11T13:06:03Z
mit.journal.volume	14	en_US
mit.journal.issue	10	en_US
mit.license	PUBLISHER_CC
mit.metadata.status	Complete

Files in this item

Name: journal.pone.0223161.pdf
Size: 1.344Mb
Format: PDF
Description: Published version

This item appears in the following Collection(s)

MIT Open Access Articles