
dc.contributor.author: Abedjan, Ziawasch
dc.contributor.author: Golab, Lukasz
dc.contributor.author: Naumann, Felix
dc.date.accessioned: 2016-12-29T19:39:40Z
dc.date.available: 2016-12-29T19:39:40Z
dc.date.issued: 2015-06
dc.date.submitted: 2015-05
dc.identifier.issn: 1066-8888
dc.identifier.issn: 0949-877X
dc.identifier.uri: http://hdl.handle.net/1721.1/106176
dc.description.abstract: Profiling data to determine metadata about a given dataset is an important and frequent activity of any IT professional and researcher and is necessary for various use-cases. It encompasses a vast array of methods to examine datasets and produce metadata. Among the simpler results are statistics, such as the number of null values and distinct values in a column, its data type, or the most frequent patterns of its data values. Metadata that are more difficult to compute involve multiple columns, namely correlations, unique column combinations, functional dependencies, and inclusion dependencies. Further techniques detect conditional properties of the dataset at hand. This survey provides a classification of data profiling tasks and comprehensively reviews the state of the art for each class. In addition, we review data profiling tools and systems from research and industry. We conclude with an outlook on the future of data profiling beyond traditional profiling tasks and beyond relational databases. [en_US]
dc.publisher: Springer Berlin Heidelberg [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1007/s00778-015-0389-y [en_US]
dc.rights: Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. [en_US]
dc.source: Springer Berlin Heidelberg [en_US]
dc.title: Profiling relational data: a survey [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Abedjan, Ziawasch, Lukasz Golab, and Felix Naumann. “Profiling Relational Data: A Survey.” The VLDB Journal 24.4 (2015): 557–581. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory [en_US]
dc.contributor.mitauthor: Abedjan, Ziawasch
dc.relation.journal: The VLDB Journal [en_US]
dc.eprint.version: Author's final manuscript [en_US]
dc.type.uri: http://purl.org/eprint/type/JournalArticle [en_US]
eprint.status: http://purl.org/eprint/status/PeerReviewed [en_US]
dc.date.updated: 2016-08-18T15:28:35Z
dc.language.rfc3066: en
dc.rights.holder: Springer-Verlag Berlin Heidelberg
dspace.orderedauthors: Abedjan, Ziawasch; Golab, Lukasz; Naumann, Felix [en_US]
dspace.embargo.terms: N [en]
dc.identifier.orcid: https://orcid.org/0000-0003-3483-0523
mit.license: PUBLISHER_POLICY [en_US]

