Unsupervised Workflow Discovery in Provenance Graphs
Author(s)
Yue, Kevin
Advisor
Cafarella, Michael
Abstract
Data is an ever-expanding part of life in today’s world. Understanding the origin and history of data, a concept known as data provenance, can thus be extremely important. In this thesis, we first address the need for a data provenance knowledge graph system, then the need to recover workflows that exist in such provenance networks in an unsupervised manner. Along with evaluating the effectiveness of existing unsupervised community and motif detection methods, we also suggest a novel approach that augments standard motif detection. Our research shows weak precision and recall numbers for almost all considered approaches, but provides a promising basis for future experimentation using more multifaceted methods.
Date issued
2022-05
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology