Lecture 17: Search Engines

Overview

Papers:

  • Brewer, Eric. "Combining Systems and Databases: A Search Engine Retrospective." In Readings in Database Systems. San Francisco, CA: Morgan Kaufmann, 1998. ISBN: 1558605231.
  • Dean, Jeffrey, and Sanjay Ghemawat. "MapReduce: Simplified Data Processing on Large Clusters." OSDI (2004): 137-150.

The first paper discusses how databases relate to search engines and covers the basics of how a search engine works. The second paper describes a specific implementation of a simple query system (called Map-Reduce) on top of the Google cluster.
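
To make the Map-Reduce programming model concrete before reading, here is a minimal single-process sketch in Python, in the spirit of the word-count example in the Dean and Ghemawat paper. It is not the Google implementation: the names map_words, reduce_counts, and run_map_reduce are hypothetical, and the sketch omits everything that makes the real system interesting (partitioning across machines, scheduling, and recovery from failures).

```python
from collections import defaultdict


def map_words(doc_id, text):
    """Map step: emit one (word, 1) pair for every word in a document."""
    for word in text.lower().split():
        yield word, 1


def reduce_counts(word, counts):
    """Reduce step: combine all values that were emitted for one key."""
    return word, sum(counts)


def run_map_reduce(documents, mapper, reducer):
    """Single-process stand-in for the framework (an assumption of this
    sketch): run the mapper over every input, group the intermediate
    pairs by key, then run the reducer once per key."""
    groups = defaultdict(list)
    for doc_id, text in documents.items():
        for key, value in mapper(doc_id, text):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in groups.items())


# Two tiny hypothetical documents, keyed by document id.
docs = {
    "d1": "search engines index the web",
    "d2": "databases index tables",
}
word_counts = run_map_reduce(docs, map_words, reduce_counts)
```

Because each call to the mapper depends only on its own input and each call to the reducer depends only on one key's values, a real system can rerun either step on another machine after a failure, which is worth keeping in mind for the failure-tolerance question below.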

As you read the papers, consider the following questions:

  1. In the "Search Engine Retrospective", what features of database systems does the author recommend that designers of search engines adopt? Why?
  2. What does Brewer claim are the primary differences between search engines and databases? What issues do search engine designers not have to worry about that database designers often focus on?
  3. What is the CAP theorem?
  4. What kinds of failures can a search engine (or the Map-Reduce system) tolerate? What consistency guarantees are provided in the face of failures?