| dc.contributor.author | Chodpathumwan, Yodsawalai | |
| dc.contributor.author | Vakilian, Ali | |
| dc.contributor.author | Termehchy, Arash | |
| dc.contributor.author | Nayyeri, Amir | |
| dc.date.accessioned | 2020-11-12T22:10:56Z | |
| dc.date.available | 2020-11-12T22:10:56Z | |
| dc.date.issued | 2018-03 | |
| dc.date.submitted | 2018-01 | |
| dc.identifier.issn | 1066-8888 | |
| dc.identifier.issn | 0949-877X | |
| dc.identifier.uri | https://hdl.handle.net/1721.1/128468 | |
| dc.description.abstract | It is known that annotating entities in unstructured and semi-structured datasets by their concepts improves the effectiveness of answering queries over these datasets. Ideally, one would like to annotate entities of all relevant concepts in a dataset. However, it takes substantial time and computational resources to annotate concepts in large datasets, and an organization may have sufficient resources to annotate only a subset of relevant concepts. Clearly, the organization would like to annotate a subset of concepts that provides the most effective answers to queries over the dataset. We propose a formal framework that quantifies the amount by which annotating entities of concepts from a taxonomy in a dataset improves the effectiveness of answering queries over the dataset. Because the problem is NP-hard, we propose efficient approximation and pseudo-polynomial time algorithms for several cases of the problem. Our extensive empirical studies validate our framework and show the accuracy and efficiency of our algorithms. | en_US |
| dc.description.sponsorship | National Science Foundation (Grants IIS-1421247, CCF-0938071, CCF-0938064 and CNS-0716532) | en_US |
| dc.publisher | Springer Science and Business Media LLC | en_US |
| dc.relation.isversionof | https://doi.org/10.1007/s00778-018-0501-1 | en_US |
| dc.rights | Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. | en_US |
| dc.source | Springer Berlin Heidelberg | en_US |
| dc.title | Cost-effective conceptual design using taxonomies | en_US |
| dc.type | Article | en_US |
| dc.identifier.citation | Chodpathumwan, Yodsawalai et al. "Cost-effective conceptual design using taxonomies." VLDB Journal 27, 3 (March 2018): 369–394 © 2018 Springer-Verlag | en_US |
| dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | en_US |
| dc.relation.journal | VLDB Journal | en_US |
| dc.eprint.version | Author's final manuscript | en_US |
| dc.type.uri | http://purl.org/eprint/type/JournalArticle | en_US |
| eprint.status | http://purl.org/eprint/status/PeerReviewed | en_US |
| dc.date.updated | 2020-09-24T21:00:01Z | |
| dc.language.rfc3066 | en | |
| dc.rights.holder | Springer-Verlag GmbH Germany, part of Springer Nature | |
| dspace.embargo.terms | Y | |
| dspace.date.submission | 2020-09-24T21:00:01Z | |
| mit.journal.volume | 27 | en_US |
| mit.journal.issue | 3 | en_US |
| mit.license | PUBLISHER_POLICY | |
| mit.metadata.status | Complete | |