
Lecture 20: Online Query Processing


Overview

Papers:

  • Hellerstein, Joseph, Ron Avnur, and Vijayshankar Raman. "Informix under CONTROL: Online Query Processing." Data Mining and Knowledge Discovery 12 (2000): 281-314.
  • Also in Readings in Database Systems. San Francisco, CA: Morgan Kaufmann, 1998. ISBN: 1558605231.

The CONTROL project investigated online query processing, another approach in the spirit of "continuous query processing." This work immediately preceded the Eddies paper that we read earlier. It focuses on producing a continuously updated set of answers to a long-running query in a traditional relational database. Each answer carries some notion of uncertainty that decreases as the query runs.
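To make "a continuously updated answer whose uncertainty decreases" concrete, here is a minimal sketch of a running average with a shrinking error bound. It is not the paper's estimator. It assumes the rows can be shuffled in memory so that every prefix is a uniform random sample, and it uses a simple normal-approximation interval that ignores the finite-population correction. The function name `online_average` and the parameters `z`, `report_every`, and `seed` are hypothetical, chosen only for illustration.

```python
import math
import random


def online_average(values, z=1.96, report_every=1000, seed=0):
    """Yield (rows_seen, running_mean, half_width) while scanning rows in random order.

    Assumption: a normal-approximation interval, mean +/- z * standard error.
    """
    rows = list(values)
    # Shuffle so that the first n rows scanned form a random sample of the table.
    random.Random(seed).shuffle(rows)

    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations (Welford's method)
    for x in rows:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
        # Report a refined estimate periodically, and once more at the end.
        if n % report_every == 0 or n == len(rows):
            if n > 1:
                std_err = math.sqrt(m2 / (n - 1) / n)
            else:
                std_err = float("inf")
            yield n, mean, z * std_err


if __name__ == "__main__":
    # Hypothetical data: 10,000 values cycling through 0..99.
    data = [float(i % 100) for i in range(10000)]
    for seen, estimate, half_width in online_average(data):
        print(f"{seen:6d} rows: AVG ~ {estimate:.2f} +/- {half_width:.2f}")
```

A user watching this output could stop the scan as soon as the interval is narrow enough for their purposes. Without the shuffle, a prefix of the scan could be biased by the physical order of the data, and the interval would no longer be meaningful.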

As you read the paper, consider the following questions:

  1. When is the CONTROL approach appropriate? What kinds of users would find it useful?
  2. Why does the CONTROL system need to randomly sample the data? How does it do this?
  3. How does the CONTROL system measure uncertainty? Is this metric appropriate? Are the assumptions used to measure uncertainty reasonable?
  4. Are any of the CONTROL techniques useful in the streaming systems we have studied? Do streaming systems obviate the need for CONTROL?