Automating data extraction from prescription document images to reduce human error
Author(s)
Zahray, Lisa (Lisa A.)
Other Contributors
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Advisor
Matt Pokress and George Verghese.
Abstract
Manual data entry from a form into a database is a time-consuming and error-prone task. In the case of prescription documents, errors are especially important to avoid in order to protect patients' health and safety. This project discusses the design and evaluation of a system that automates portions of the data entry workflow, focusing on prescription information originating from fax forms. The first part of the thesis discusses the approaches used for faxes of a known format, using techniques including denoising, deskewing, template matching, and handwritten digit recognition. One successful task in this area was checkbox detection to identify whether prescriptions were renewed or denied. The second part of the thesis focuses on faxes of unknown formats, utilizing optical character recognition (OCR) technology and a customized implementation of an approximate string matching algorithm. Customer and prescriber information was extracted with high accuracy, and drug name extraction was investigated, with suggestions for further improvement.
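As an illustration of the approximate string matching mentioned above, the following is a minimal sketch of one standard technique: a substring variant of edit distance (often attributed to Sellers) that finds where a known term best matches inside noisy OCR output. It is not the thesis's customized implementation, which this record does not describe. The function name `best_approximate_match`, the sample drug name, and the sample OCR string are all hypothetical.

```python
def best_approximate_match(pattern: str, text: str) -> tuple[int, int]:
    """Return (distance, end) for the substring of `text` closest to `pattern`.

    `distance` is the Levenshtein distance between `pattern` and the best
    matching substring; `end` is the index just past that substring's end.
    Generic sketch only; not the algorithm from the thesis.
    """
    m = len(pattern)
    # prev[i]: cost of matching pattern[:i] against a substring of the text
    # ending at the previous text position. Before any text, cost is i.
    prev = list(range(m + 1))
    best_dist, best_end = prev[m], 0
    for j, ch in enumerate(text, start=1):
        # curr[0] stays 0 so that a match may start at any text position.
        curr = [0] * (m + 1)
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            curr[i] = min(
                prev[i - 1] + cost,  # match or substitute
                prev[i] + 1,         # extra character in the text
                curr[i - 1] + 1,     # pattern character missing from the text
            )
        if curr[m] < best_dist:
            best_dist, best_end = curr[m], j
        prev = curr
    return best_dist, best_end


# Hypothetical OCR output where "m" was misread as "rn".
ocr_text = "Rx: arnoxicillin 500mg"
distance, end = best_approximate_match("amoxicillin", ocr_text)
print(distance, repr(ocr_text[:end]))
```

A caller would typically accept a match only when the distance is small relative to the pattern length. That threshold is a design choice that this sketch leaves open.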
Description
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, June 2019. Cataloged from student-submitted PDF of thesis. Includes bibliographical references (pages 61-63).
Date issued
2019
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.