dc.contributor.author: Chen, Kuang
dc.contributor.author: Chen, Harr
dc.contributor.author: Conway, Neil
dc.contributor.author: Hellerstein, Joseph M.
dc.contributor.author: Parikh, Tapan S.
dc.date.accessioned: 2012-04-04T19:49:20Z
dc.date.available: 2012-04-04T19:49:20Z
dc.date.issued: 2010-03
dc.identifier.isbn: 978-1-4244-5444-0
dc.identifier.isbn: 978-1-4244-5445-7
dc.identifier.other: INSPEC Accession Number: 11258785
dc.identifier.uri: http://hdl.handle.net/1721.1/69935
dc.description.abstract: Data quality is a critical problem in modern databases. Data entry forms present the first and arguably best opportunity for detecting and mitigating errors, but there has been little research into automatic methods for improving data quality at entry time. In this paper, we propose USHER, an end-to-end system for form design, entry, and data quality assurance. Using previous form submissions, USHER learns a probabilistic model over the questions of the form. USHER then applies this model at every step of the data entry process to improve data quality. Before entry, it induces a form layout that captures the most important data values of a form instance as quickly as possible. During entry, it dynamically adapts the form to the values being entered, and enables real-time feedback to guide the data enterer toward their intended values. After entry, it re-asks questions that it deems likely to have been entered incorrectly. We evaluate all three components of USHER using two real-world data sets. Our results demonstrate that each component has the potential to improve data quality considerably, at a reduced cost when compared to current practice. [en_US]
dc.description.sponsorship: Yahoo! Research Labs (Yahoo Labs Technology for Good Fellowship) [en_US]
dc.description.sponsorship: National Science Foundation (U.S.). Graduate Research Fellowship Program [en_US]
dc.description.sponsorship: National Science Foundation (U.S.) (Grant 0713661) [en_US]
dc.language.iso: en_US
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE) [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1109/ICDE.2010.5447832 [en_US]
dc.rights: Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. [en_US]
dc.source: IEEE [en_US]
dc.title: USHER: Improving data quality with dynamic forms [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Chen, Kuang et al. “USHER: Improving Data Quality with Dynamic Forms.” IEEE, 2010. 321–332. Web. 4 Apr. 2012. © 2010 Institute of Electrical and Electronics Engineers [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory [en_US]
dc.contributor.approver: Hellerstein, Joseph M.
dc.contributor.mitauthor: Chen, Harr
dc.relation.journal: 2010 IEEE 26th International Conference on Data Engineering (ICDE) [en_US]
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
dspace.orderedauthors: Chen, Kuang; Chen, Harr; Conway, Neil; Hellerstein, Joseph M.; Parikh, Tapan S. [en]
mit.license: PUBLISHER_POLICY [en_US]
mit.metadata.status: Complete

