Lecture 2: The Relational Model

Overview

In this lecture, we will continue our discussion of data models and database system architecture, looking in more detail at the relational model.

There is a lot of reading for this lecture. You should start early and try to digest it all, as it lays the foundation for much of what is to come. The papers are:

Michael Stonebraker and Joseph Hellerstein. "What Goes Around Comes Around." In Readings in Database Systems. San Francisco, CA: Morgan Kaufmann, 1998. ISBN: 1558605231 (aka the Red Book). Read Sections 1-4 (if you know something about XML, you may also enjoy reading Sections 10 and 11; they are classic Stonebraker).
E. F. Codd. "A Relational Model of Data for Large Shared Data Banks." Communications of the ACM 13, no. 6 (1970): 377-387. Focus on Section 1.3 and all of Section 2.

You may also find it useful to read the following text for a brief overview of the relational model:

Ramakrishnan, Raghu, and Johannes Gehrke. Database Management Systems. New York, NY: McGraw-Hill, 2002, pp. 57-63. ISBN: 0072465638.

As you read these papers, think about and be prepared to answer the following questions in lecture:

What is the notion of data independence? Why is it important?
Codd spends a fair amount of time talking about "normal forms." Why is it important that a database be stored in a normal form?
What are the key ideas behind the relational model? Why are they an improvement over what came before? In what ways is the relational model restrictive?
What, according to Codd, are the most important differences between the "hierarchical" model (as exemplified by systems like IMS) and the relational model that Codd proposes? Make sure you understand what Codd means by "Data Dependencies."