DSpace@MIT

Query Optimization for Dynamic Imputation

Author(s)
Cambronero, José; Feser, John K.; Smith, Micah J.; Madden, Samuel
Terms of use
Creative Commons Attribution-NonCommercial-NoDerivs License http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract
© 2017 VLDB. Missing values are common in data analysis and present a usability challenge. Users are forced to pick between removing tuples with missing values or creating a cleaned version of their data by applying a relatively expensive imputation strategy. Our system, ImputeDB, incorporates imputation into a cost-based query optimizer, performing necessary imputations on-the-fly for each query. This allows users to immediately explore their data, while the system picks the optimal placement of imputation operations. We evaluate this approach on three real-world survey-based datasets. Our experiments show that our query plans execute between 10 and 140 times faster than first imputing the base tables. Furthermore, we show that the query results from on-the-fly imputation differ from the traditional base-table imputation approach by 0-8%. Finally, we show that while dropping tuples with missing values that fail query constraints discards 6-78% of the data, on-the-fly imputation loses only 0-21%.
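
The trade-off described in the abstract can be illustrated with a small Python sketch. This is not ImputeDB's planner or API: the table, the column names, the helper functions, and the use of mean imputation are all assumptions made for illustration. It compares three hand-written plans for one query (average income in one region): impute the base table first, filter first and impute only the selected rows, or drop rows with missing values. The number of imputed values stands in as a rough proxy for imputation cost.

```python
from statistics import mean

# Hypothetical toy table (not from the paper); None marks a missing value.
ROWS = [
    {"region": "north", "income": 40.0},
    {"region": "north", "income": None},
    {"region": "north", "income": 60.0},
    {"region": "south", "income": None},
    {"region": "south", "income": 30.0},
    {"region": "south", "income": None},
]


def mean_impute(rows, column):
    """Fill missing values in `column` with the mean of the observed values.

    Mean imputation is a stand-in chosen for brevity. Returns the filled
    rows and the number of values imputed. Raises statistics.StatisticsError
    if no value in `column` is observed.
    """
    observed = [r[column] for r in rows if r[column] is not None]
    fill = mean(observed)
    filled = [
        dict(r, **{column: fill}) if r[column] is None else dict(r)
        for r in rows
    ]
    return filled, len(rows) - len(observed)


def avg_income_base_table(rows, region):
    """Plan A: impute the whole base table, then filter and aggregate."""
    imputed, n_imputed = mean_impute(rows, "income")
    selected = [r for r in imputed if r["region"] == region]
    return mean(r["income"] for r in selected), n_imputed


def avg_income_on_the_fly(rows, region):
    """Plan B: filter first, then impute only the rows the query touches."""
    selected = [r for r in rows if r["region"] == region]
    imputed, n_imputed = mean_impute(selected, "income")
    return mean(r["income"] for r in imputed), n_imputed


def avg_income_drop(rows, region):
    """Plan C: drop rows with a missing income instead of imputing."""
    kept = [
        r for r in rows
        if r["region"] == region and r["income"] is not None
    ]
    return mean(r["income"] for r in kept), len(kept)


if __name__ == "__main__":
    for plan in (avg_income_base_table, avg_income_on_the_fly, avg_income_drop):
        print(plan.__name__, plan(ROWS, "north"))
```

On this toy table, the base-table plan imputes three values and the filter-first plan imputes one, and the two plans return slightly different averages because they compute the fill value over different sets of rows. The drop plan imputes nothing but aggregates over only two of the three matching rows. Here the plans are written by hand; according to the abstract, ImputeDB instead lets a cost-based optimizer decide where imputation operations are placed.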
Date issued
2017-08
URI
https://hdl.handle.net/1721.1/137765
Department
Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Laboratory for Information and Decision Systems
Publisher
VLDB Endowment
Citation
Cambronero, José, Feser, John K., Smith, Micah J. and Madden, Samuel. 2017. "Query Optimization for Dynamic Imputation." Proceedings of the VLDB Endowment, 10 (11).
Version: Final published version
ISSN
2150-8097

Collections
  • MIT Open Access Articles
