Lecture 4: Database Design

Overview

Two papers are assigned for this lecture:

Hellerstein, Joseph, and Michael Stonebraker. "The Anatomy of a Database System." In Readings in Database Systems. San Francisco, CA: Morgan Kaufmann, 1998. ISBN: 1558605231 (aka the Red Book). Focus on Sections 1-4, though you should also read Sections 5.1 and 5.2 and skim Section 6.
Astrahan, M. M., et al. "System R: Relational Approach to Database Management." ACM TODS 1, no. 2 (1976): 97-137. Read up to page 122; you may also skip the "Optimizer" section, pages 110-114.

The purpose of these readings is to introduce the architecture of a database system at a high level. Our goal in lecture will be to tease apart the main components found in most database systems. Once we've identified these components, we will discuss each of them in turn over the next few weeks.

Both of these papers assume a certain degree of familiarity with database 'lingo', some of which will doubtless be unfamiliar to you. As you read, keep track of the terms you do not know, and come to class prepared to ask questions!

Also, as you read, think about the following questions and come to class prepared to answer them:

What is the purpose of the division between RDS and RSS in System R? Is there something fundamental about this design?
Why are process models in database systems important? Under what circumstances would I want multiple processes in my database? Are there any circumstances in which a "process per query" model would be preferable to a "thread per query" model?
What is the iterator model? Why is the iterator model convenient? Can you think of circumstances under which the iterator model is a bad idea?
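
As a starting point for the last question, here is a minimal sketch of what an iterator-style query plan might look like. It is an illustration, not code from either paper: the class names (Operator, SeqScan, Filter, Project), the open/next/close method names, and the run helper are all assumptions chosen for this example. Each operator pulls one row at a time from its child, so operators can be composed without knowing what sits beneath them.

```python
from typing import Callable, Iterable, List, Optional, Sequence, Tuple

Row = Tuple  # a row is modeled as a plain tuple (an assumption for this sketch)


class Operator:
    """Hypothetical common interface: every operator exposes open/next/close."""

    def open(self) -> None:
        raise NotImplementedError

    def next(self) -> Optional[Row]:
        """Return the next row, or None when the input is exhausted."""
        raise NotImplementedError

    def close(self) -> None:
        raise NotImplementedError


class SeqScan(Operator):
    """Leaf operator: scans an in-memory list standing in for a table."""

    def __init__(self, rows: Iterable[Row]) -> None:
        self._rows = list(rows)
        self._pos = 0

    def open(self) -> None:
        self._pos = 0  # restart the scan from the first row

    def next(self) -> Optional[Row]:
        if self._pos >= len(self._rows):
            return None
        row = self._rows[self._pos]
        self._pos += 1
        return row

    def close(self) -> None:
        pass  # nothing to release for an in-memory list


class Filter(Operator):
    """Passes through only the child rows that satisfy the predicate."""

    def __init__(self, child: Operator, predicate: Callable[[Row], bool]) -> None:
        self._child = child
        self._predicate = predicate

    def open(self) -> None:
        self._child.open()

    def next(self) -> Optional[Row]:
        while True:
            row = self._child.next()
            if row is None or self._predicate(row):
                return row

    def close(self) -> None:
        self._child.close()


class Project(Operator):
    """Keeps only the listed column positions of each child row."""

    def __init__(self, child: Operator, columns: Sequence[int]) -> None:
        self._child = child
        self._columns = list(columns)

    def open(self) -> None:
        self._child.open()

    def next(self) -> Optional[Row]:
        row = self._child.next()
        if row is None:
            return None
        return tuple(row[i] for i in self._columns)

    def close(self) -> None:
        self._child.close()


def run(plan: Operator) -> List[Row]:
    """Drive a plan to completion by pulling rows from its root operator."""
    plan.open()
    out = []
    row = plan.next()
    while row is not None:
        out.append(row)
        row = plan.next()
    plan.close()
    return out


# Roughly: SELECT name FROM emp WHERE age >= 18
emp = [("alice", 30), ("bob", 17), ("carol", 45)]
plan = Project(Filter(SeqScan(emp), lambda r: r[1] >= 18), [0])
result = run(plan)
```

Note that every call to next crosses one method boundary per operator per row. Keep that in mind when you consider when this design might be a bad idea.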