Automating data extraction from prescription document images to reduce human error
Author(s): Zahray, Lisa (Lisa A.)
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Matt Pokress and George Verghese.
Manual data entry from a form into a database is a time-consuming and error-prone task. In the case of prescription documents, errors are especially important to avoid in order to protect patients' health and safety. This project discusses the design and evaluation of a system that automates portions of the data entry workflow, focusing on prescription information originating from fax forms. The first part of the thesis discusses the approaches used for faxes of a known format, using techniques including denoising, deskewing, template matching, and handwritten digit recognition. One successful task in this area was checkbox detection to identify whether prescriptions were renewed or denied. The second part of the thesis focuses on faxes of unknown formats, utilizing optical character recognition (OCR) technology and a customized implementation of an approximate string matching algorithm. Customer and prescriber information was extracted with high accuracy, and drug name extraction was investigated, with suggestions for further improvement.
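The abstract does not describe the customized approximate string matching implementation, so the following is only a generic sketch of the underlying idea: finding the substring of noisy OCR output that is closest, by edit distance, to a known field value. The function name `best_fuzzy_match` and the sample strings are hypothetical illustrations and are not taken from the thesis.

```python
from typing import Tuple


def best_fuzzy_match(pattern: str, text: str) -> Tuple[int, int]:
    """Return (edit_distance, end_index) for the substring of `text`
    that is closest to `pattern` under Levenshtein distance.

    `end_index` is exclusive: the best match ends at text[:end_index].
    This is a textbook dynamic program, not the thesis's algorithm.
    """
    # prev[i]: cost of aligning pattern[:i] against an empty stretch of text.
    prev = list(range(len(pattern) + 1))
    best_dist, best_end = prev[-1], 0

    for j, text_char in enumerate(text, start=1):
        # A match may start at any position in the text,
        # so aligning the empty pattern prefix always costs 0.
        curr = [0]
        for i, pattern_char in enumerate(pattern, start=1):
            cost = 0 if pattern_char == text_char else 1
            curr.append(min(
                prev[i - 1] + cost,  # match or substitution
                prev[i] + 1,         # extra character in the text
                curr[i - 1] + 1,     # pattern character missing from the text
            ))
        # Keep the earliest end position with the smallest distance.
        if curr[-1] < best_dist:
            best_dist, best_end = curr[-1], j
        prev = curr

    return best_dist, best_end


# Hypothetical OCR output in which one letter of the drug name was dropped.
distance, end = best_fuzzy_match("Amoxicillin", "Rx: Amoxicilin 500mg")
```

In a pipeline like the one summarized above, a small distance relative to the pattern length could be treated as a probable match and a large one as a miss. Any such threshold is a tuning choice assumed here, not one reported in the abstract.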
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, June, 2019
Cataloged from student-submitted PDF of thesis.
Includes bibliographical references (pages 61-63).
Department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Massachusetts Institute of Technology
Electrical Engineering and Computer Science.