dc.contributor.author: Sheth, Mallory
dc.contributor.author: Gerovitch, Albert S.
dc.contributor.author: Welsch, Roy E.
dc.date.accessioned: 2020-03-31T18:58:56Z
dc.date.available: 2020-03-31T18:58:56Z
dc.date.issued: 2019-10-11
dc.identifier.issn: 1932-6203
dc.identifier.uri: https://hdl.handle.net/1721.1/124462
dc.description.abstract: In many data classification problems, a number of methods will give similar accuracy. However, when working with people who are not experts in data science, such as doctors, lawyers, and judges, finding interpretable algorithms can be a critical success factor. Practitioners have a deep understanding of the individual input variables but far less insight into how they interact with each other. For example, there may be ranges of an input variable for which the observed outcome is significantly more or less likely. This paper describes an algorithm for automatic detection of such thresholds, called the Univariate Flagging Algorithm (UFA). The algorithm searches for a separation that optimizes the difference between separated areas while obtaining a high level of support. We evaluate its performance using six sample datasets and demonstrate that thresholds identified by the algorithm align well with published results and known physiological boundaries. We also introduce two classification approaches that use UFA and show that the performance attained on unseen test data is comparable to or better than traditional classifiers when confidence intervals are considered. We identify conditions under which UFA performs well, including applications with large amounts of missing or noisy data, applications with a large number of inputs relative to observations, and applications where incidence of the target is low. We argue that ease of explanation of the results, robustness to missing data and noise, and detection of low-incidence adverse outcomes are desirable features for clinical applications that can be achieved with a relatively simple classifier such as UFA. [en_US]
dc.language.iso: en
dc.publisher: Public Library of Science (PLoS) [en_US]
dc.relation.isversionof: 10.1371/journal.pone.0223161 [en_US]
dc.rights: Creative Commons Attribution 4.0 International license [en_US]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [en_US]
dc.source: PLoS [en_US]
dc.subject: General Biochemistry, Genetics and Molecular Biology [en_US]
dc.subject: General Agricultural and Biological Sciences [en_US]
dc.subject: General Medicine [en_US]
dc.title: The Univariate Flagging Algorithm (UFA): An interpretable approach for predictive modeling [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Sheth, Mallory et al. "The Univariate Flagging Algorithm (UFA): An interpretable approach for predictive modeling." PloS one 14 (2019) [en_US]
dc.contributor.department: Sloan School of Management [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science [en_US]
dc.relation.journal: PloS one [en_US]
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/JournalArticle [en_US]
eprint.status: http://purl.org/eprint/status/PeerReviewed [en_US]
dc.date.updated: 2020-02-11T13:06:01Z
dspace.date.submission: 2020-02-11T13:06:03Z
mit.journal.volume: 14 [en_US]
mit.journal.issue: 10 [en_US]
mit.license: PUBLISHER_CC
mit.metadata.status: Complete

