Pretraining Table Embeddings for Knowledge Graph Based Provenance Systems
Author(s)
Yang, Steven
Advisor
Cafarella, Michael J.
Abstract
We aim to build a knowledge-graph-based provenance system for data objects across institutions and teams. The world of data objects and systems is complex and heterogeneous. For effective collaboration, a shared data model is needed. Specifically, this work examines the problem of provenance subgraph classification: given a coarser low-level provenance subgraph that is not easily digestible by humans, we want to annotate the subgraph with human-readable labels describing the operations done on each data object. This work first involves creating the infrastructure needed to select and label subgraphs. Next, this work focuses on producing table embedding techniques using the pretrain-and-finetune paradigm, with an emphasis on the downstream task of Operator Classification.
Date issued
2022-05
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology