Course Meeting Times

Lectures: 2 sessions / week, 1.5 hours / session

Recitations: 1 session / week, 1 hour / session


This course is an introduction to data analysis and applied statistics, including multiple regression, analysis of variance, and nonparametric methods. It is intended for students who have taken a course in probability and a course in linear algebra. It is a full-semester course with three hours of lectures and a one-hour recitation each week. Data analysis is difficult without computing tools, so the recitations, in addition to answering questions about the course material, will introduce you to statistical computing.


Dr. Elizabeth Newton. Often, I am available after class, but the best way to see me is to schedule some time by phone or email. Office hours will be announced. Information about the course will be posted on the class web site.


Statistics and Data Analysis: From Elementary to Intermediate (SDA) by Ajit C. Tamhane and Dorothy D. Dunlop (Prentice Hall, 2000). I will put some other books on reserve. Information about the text and any errors it contains can be found at the authors' web site.


I will use overheads during lectures and will put copies on the class web site. Other details will be given on the board. You should read the material before lecture so that you have some idea of what will be discussed, even if you don't understand everything. Please ask questions as I go along. Most of our time will be spent on the more difficult material rather than on things you can understand easily. Lectures will not replace reading the textbook, since I will often work through examples rather than repeat details that are in the text. After a first reading and the lectures, you should attempt the homework. This will require you to reread the material and will generate new questions, which you should bring up in recitation or in office hours.


Generally, the Teaching Assistant will conduct the recitations and cover material related to data analysis and statistical computing. There will also be time to discuss homework problems and examples and to clear up any confusion from lectures.

Grading and Exams

The idea is to have everyone learn the material. Grades are required, but low grades are an indication that both you and I have failed to do our jobs. If you are having problems, don't let them slide until the end. There will be homework every week or ten days that will be graded and returned. (Sampling may be used; i.e., only a portion of the problems may be graded. However, solutions will be provided for all of them.) Once we have handed out the solution sheet for a homework set (usually at the next class after it is due), late homework will not be accepted.

The midterm will be a 1.5-hour in-class open-book examination, and the final will be a scheduled 3-hour open-book examination. If it appears to be necessary, there may be quizzes during the semester covering material that is emphasized in lecture.

Without quizzes, homework will count 35%, class participation 10%, the midterm 20%, and the final 35%. If quizzes become necessary, these percentages will be adjusted accordingly.
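As an illustration of how the no-quiz weights combine, here is a minimal Python sketch. The weights come from the paragraph above; the 0-100 scale for each component, the function name `course_score`, and the example scores are assumptions made only for this illustration. How the resulting number maps to a letter grade is not specified here.

```python
# Weights for the no-quiz case, taken from the syllabus.
WEIGHTS = {
    "homework": 0.35,
    "participation": 0.10,
    "midterm": 0.20,
    "final": 0.35,
}


def course_score(scores):
    """Weighted average of component scores (each assumed to be on a 0-100 scale)."""
    # The weights should account for the whole grade.
    assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {"homework": 90, "participation": 100, "midterm": 80, "final": 85}
print(round(course_score(example), 2))
```

With these made-up scores, the homework and the final each contribute more than a third of the total, so a weak midterm can be offset by steady homework and a strong final.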


Many data analysis packages are available. This semester we will use S-PLUS®, which is easily available at MIT and widely used in teaching and industry. You may use another package if you wish, but we cannot promise support for it. The server has S-PLUS®, SAS® and STATA®. The Sloan Computing Labs support S-PLUS®, SAS®, JMP®, SPSS®, and STATA®. MATLAB® is also a possibility for those who want to do some programming. We have found that some knowledge of S-PLUS®, SAS® or SPSS® can lead to good academic-year, summer, and permanent jobs.

There are many introductory books on S-PLUS®, including An Introduction to S-Plus® for Windows by Longhow Lam and The Basics of S and S-Plus® by Krause and Olson.

Datasets not on the disks at the back of the textbooks will be on the fileserver.

Home Computing

There are student versions of S-PLUS®, SAS®, JMP®, STATA® and SPSS® that allow you to have these packages on your home computer. An S-PLUS® 6.1 CD will be available, which students can copy for installation on their home machines.

Work Load

This is a 4-0-8 course. We will have three main hours of lecture and one additional hour of recitation or demonstration each week. Homework should take the median student about 8 hours each week. If we have misjudged this load (most often because computing can take more time than we think), please let us know and we will see how the rest of the class feels as well.


Please let me (or the TA) know (anonymously, if you wish) what is going right and what is going wrong with lectures, homework, content, etc. Filling out forms at the end of the course will help future students, but will not help you while you are taking the course. It is far better for all of us if we can work on these problems during the course and leave you satisfied at the end.

Academic Honesty

It is best to attempt the homework on your own and then ask us questions. In a pinch, talk to your classmates for clarification. What goes on your homework paper should be your own work. As a statistician, I expect variation among students both in correct and incorrect solutions. Lack of such variation has led to embarrassing questions and reduced grades in the past. The exams should, of course, be entirely your own work. Any evidence of cheating will result in a failing grade for the course and disciplinary action through the appropriate MIT procedures and committees. This applies to those who give help as well as to those who receive it.

MIT's academic honesty policy can be found at MIT Policies and Procedures.