Cost-effective conceptual design using taxonomies
Author(s)
Chodpathumwan, Yodsawalai; Vakilian, Ali; Termehchy, Arash; Nayyeri, Amir
Publisher Policy
Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.
Abstract
It is known that annotating entities in unstructured and semi-structured datasets by their concepts improves the effectiveness of answering queries over these datasets. Ideally, one would like to annotate entities of all relevant concepts in a dataset. However, annotating concepts in large datasets takes substantial time and computational resources, and an organization may have sufficient resources to annotate only a subset of the relevant concepts. In that case, the organization would like to annotate the subset of concepts that provides the most effective answers to queries over the dataset. We propose a formal framework that quantifies the amount by which annotating entities of concepts from a taxonomy in a dataset improves the effectiveness of answering queries over the dataset. Because the problem is NP-hard, we propose efficient approximation and pseudo-polynomial time algorithms for several cases of the problem. Our extensive empirical studies validate our framework and show the accuracy and efficiency of our algorithms.
Date issued
2018-03
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Journal
VLDB Journal
Publisher
Springer Science and Business Media LLC
Citation
Chodpathumwan, Yodsawalai et al. "Cost-effective conceptual design using taxonomies." VLDB Journal 27, 3 (March 2018): 369–394 © 2018 Springer-Verlag
Version: Author's final manuscript
ISSN
1066-8888
0949-877X