| dc.contributor.advisor | David Sontag. | en_US | 
| dc.contributor.author | Oberst, Michael Karl. | en_US | 
| dc.contributor.other | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. | en_US | 
| dc.date.accessioned | 2020-03-09T18:59:12Z |  | 
| dc.date.available | 2020-03-09T18:59:12Z |  | 
| dc.date.copyright | 2019 | en_US | 
| dc.date.issued | 2019 | en_US | 
| dc.identifier.uri | https://hdl.handle.net/1721.1/124128 | en_US | 
| dc.description | This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. | en_US | 
| dc.description | Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019 | en_US | 
| dc.description | Cataloged from student-submitted PDF version of thesis. | en_US | 
| dc.description | Includes bibliographical references (pages 97-102). | en_US | 
| dc.description.abstract | Inspired by a growing interest in applying reinforcement learning (RL) to healthcare, we introduce a procedure for performing qualitative introspection and 'debugging' of models and policies. In particular, we make use of counterfactual trajectories, which describe the implicit belief (of a model) of 'what would have happened' if a policy had been applied. These serve to decompose model-based estimates of reward into specific claims about specific trajectories, a useful tool for 'debugging' of models and policies, especially when side information is available for domain experts to review alongside the counterfactual claims. More specifically, we give a general procedure (using structural causal models) to generate counterfactuals based on an existing model of the environment, including common models used in model-based RL. We apply our procedure to a pair of synthetic applications to build intuition, and conclude with an application on real healthcare data, introspecting a policy for sepsis management learned in the recently published work of Komorowski et al. (2018). | en_US | 
| dc.description.statementofresponsibility | by Michael Karl Oberst. | en_US | 
| dc.format.extent | 102 pages | en_US | 
| dc.language.iso | eng | en_US | 
| dc.publisher | Massachusetts Institute of Technology | en_US | 
| dc.rights | MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy, which is available through the URL provided. | en_US | 
| dc.rights.uri | http://dspace.mit.edu/handle/1721.1/7582 | en_US | 
| dc.subject | Electrical Engineering and Computer Science. | en_US | 
| dc.title | Counterfactual policy introspection using structural causal models | en_US | 
| dc.type | Thesis | en_US | 
| dc.description.degree | S.M. | en_US | 
| dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | en_US | 
| dc.identifier.oclc | 1142635604 | en_US | 
| dc.description.collection | S.M. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science | en_US | 
| dspace.imported | 2020-09-14T18:40:29Z | en_US |