MIT Libraries DSpace@MIT

Predicting unknown adverse drug reactions using an unsupervised node embedding algorithm

Author(s)
Das, Sourav.
Download: 1144999394-MIT.pdf (6.555Mb)
Other Contributors
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Advisor
Lalana Kagal.
Terms of use
MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. http://dspace.mit.edu/handle/1721.1/7582
Abstract
Adverse Drug Reactions (ADRs), defined as undesirable effects of a medication that occur during or after usual clinical use, pose a major health risk and result in the hospitalization of millions of patients each year. While pre-marketing clinical trials evaluate the safety and efficacy of a new drug, post-marketing surveillance identifies and monitors ADRs that were not identified during trials. Most existing approaches focus on ADR detection in the post-marketing phase. They also rely mostly on supervised machine learning, which requires significant data preprocessing and feature engineering. I developed a customizable framework based on unsupervised learning that allows users to run prediction tasks on different types of labeled graph data. The framework first creates a knowledge graph from the data, then uses an unsupervised algorithm to create embeddings (vector representations) of the nodes in the knowledge graph, and finally runs the prediction task. The framework can learn an embedding for any newly added node as long as it is connected to other nodes, so users can create embeddings for any drug still in the pre-marketing stage as long as its related drug attributes are present in the knowledge graph. Using DrugBank and FAERS, I created a knowledge graph of drugs and drug attributes. To emulate drugs in the pre-marketing stage, I removed all the drug-ADR edges in the test dataset. Then, I experimented with different parameters of the node embedding algorithm and three classifiers: a multilayer perceptron (MLP), k-nearest neighbors (KNN), and random forest. The models were trained to predict 9 different ADR associations for any drug, and the results showed that the MLP classifier was the best model, with an AUROC of 0.79, which is comparable to existing approaches while offering much greater customizability. This approach has the potential to improve how ADRs are predicted and to allow them to be detected at a far earlier stage, thus improving patient safety.
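The three stages described in the abstract (knowledge graph, unsupervised node embeddings, prediction) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the drug and attribute names are hypothetical toy data rather than DrugBank or FAERS records, the embedding is a deliberately simple stand-in (one-hot vectors smoothed by neighbor averaging) because the abstract does not name the algorithm used, a single hypothetical ADR replaces the 9 ADR associations, and ADR labels are kept outside the graph instead of being stored as drug-ADR edges. KNN is used because it is one of the three classifiers the abstract lists; the `auroc` helper computes the metric by pairwise comparison.

```python
import math
from collections import defaultdict

# Hypothetical toy edges (drug, attribute); not taken from DrugBank or FAERS.
ATTRIBUTE_EDGES = [
    ("drug_a", "target_1"), ("drug_a", "enzyme_1"), ("drug_a", "category_x"),
    ("drug_b", "target_1"), ("drug_b", "enzyme_1"), ("drug_b", "category_x"),
    ("drug_c", "target_2"), ("drug_c", "enzyme_2"), ("drug_c", "category_x"),
    ("drug_d", "target_2"), ("drug_d", "enzyme_2"), ("drug_d", "category_x"),
    # A "pre-marketing" drug: attributes are known, ADR information is not.
    ("drug_new", "target_1"), ("drug_new", "enzyme_1"), ("drug_new", "category_x"),
]

# Hypothetical labels for one ADR, available only for the training drugs.
ADR_LABELS = {"drug_a": 1, "drug_b": 1, "drug_c": 0, "drug_d": 0}


def build_graph(edges):
    """Build an undirected graph as a node -> neighbor-set mapping."""
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)
    return graph


def embed(graph, rounds=1):
    """Stand-in unsupervised embedding: start from one-hot vectors and
    replace each vector by the mean of itself and its neighbors."""
    nodes = sorted(graph)
    index = {n: i for i, n in enumerate(nodes)}
    vecs = {n: [1.0 if i == index[n] else 0.0 for i in range(len(nodes))]
            for n in nodes}
    for _ in range(rounds):
        smoothed = {}
        for node in nodes:
            group = [vecs[node]] + [vecs[nb] for nb in sorted(graph[node])]
            smoothed[node] = [sum(col) / len(group) for col in zip(*group)]
        vecs = smoothed
    return vecs


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn_score(vecs, labels, query, k=2):
    """Fraction of the k most similar labeled drugs that have the ADR."""
    ranked = sorted(labels, key=lambda d: cosine(vecs[query], vecs[d]),
                    reverse=True)
    return sum(labels[d] for d in ranked[:k]) / k


def auroc(scores, labels):
    """AUROC as the share of (positive, negative) pairs ranked correctly,
    counting ties as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Calling `knn_score(embed(build_graph(ATTRIBUTE_EDGES)), ADR_LABELS, "drug_new")` scores the unlabeled drug from its attribute neighborhood alone, which is the situation the abstract emulates by removing drug-ADR edges from the test set. A real pipeline would likely use a learned embedding and a held-out test set large enough for the AUROC to be meaningful.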
Description
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
 
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019
 
Cataloged from student-submitted PDF version of thesis.
 
Includes bibliographical references (pages 67-68).
 
Date issued
2019
URI
https://hdl.handle.net/1721.1/124239
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.
