An application of advanced statistical techniques to forecast the demand for air transportation
Author(s): Mathaisel, Dennis F. X.; Taneja, Nawal K.
Statistical techniques to forecast the demand for air transportation
Massachusetts Institute of Technology. Flight Transportation Laboratory
Introduction and objectives: For some time now, regression models, often calibrated using the ordinary least-squares (OLS) estimation procedure, have been common tools for forecasting the demand for air transportation. In recent years, however, more and more decision makers have begun to use these models not only to forecast traffic but also to analyze alternative policies and strategies. Despite this wider use in policy analysis, few analysts have investigated in depth the validity and precision of the models for that purpose. To use these models properly and effectively, it is essential not only to understand the assumptions underlying the estimation procedure and their implications, but also to subject these assumptions to rigorous scrutiny. For example, one of the assumptions built into the ordinary least-squares estimation technique is that the explanatory variables should not be correlated among themselves. If the variables are fairly collinear, the sample variance of the coefficient estimators increases significantly, which results in inaccurate estimation of the coefficients and uncertainty about which explanatory variables belong in the model. As a corrective procedure, it is common practice among demand analysts to drop from the model those explanatory variables whose t-statistics are insignificant. This is not a valid procedure: if collinearity is present, the increase in the variance of the coefficients lowers the t-statistics and leads to the rejection of explanatory variables which in theory do explain the variation in the dependent variable. Thus, if one or more of the assumptions underlying the OLS estimation procedure are violated, the analyst must either use appropriate correction procedures or use alternative estimation techniques.
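The effect of collinearity on coefficient variance can be illustrated with a minimal sketch. It assumes the simplest case of two explanatory variables, where the variance inflation factor reduces to 1 / (1 - r^2), with r the correlation between the two regressors. The function names and the data below are hypothetical and are not taken from the report.

```python
import math

def pearson_r(x, y):
    """Sample correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def variance_inflation_factor(x1, x2):
    """VIF for a model with exactly two regressors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)

# Hypothetical data, chosen only to contrast weak and strong collinearity.
x1 = [1, 2, 3, 4, 5, 6]
x2_weak = [1, -1, 1, -1, 1, -1]               # nearly uncorrelated with x1
x2_strong = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]    # almost a copy of x1

vif_weak = variance_inflation_factor(x1, x2_weak)
vif_strong = variance_inflation_factor(x1, x2_strong)

# The standard error of each coefficient grows by sqrt(VIF), so the
# t-statistic shrinks by the same factor, all else being equal.
print("weak collinearity:   VIF =", round(vif_weak, 3))
print("strong collinearity: VIF =", round(vif_strong, 1))
print("t-statistic shrink factor:", round(math.sqrt(vif_strong), 1))
```

In the strongly collinear case the t-statistic can fall below the usual significance threshold even when the variable genuinely explains the dependent variable, which is why dropping variables on the basis of the t-statistic alone is unreliable.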
The purpose of the study herein is threefold: (1) to develop a "good" simple regression model to forecast as well as analyze the demand for air transportation; (2) using this model, to demonstrate the application of various statistical tests to evaluate the validity of each of the major assumptions underlying the OLS estimation procedure with respect to its expanded use in policy analysis; and (3) to demonstrate the application of some advanced and relatively new statistical estimation procedures which are not only appropriate but essential in eliminating the common problems encountered in regression models when some of the underlying assumptions in the OLS procedure are violated. The incentive for the first objective, to develop a relatively simple single-equation regression model to forecast as well as analyze the demand for air transportation (as measured by revenue passenger miles in U.S. domestic trunk operations), stemmed from a recently published study by the U.S. Civil Aeronautics Board [CAB, 1976]. In the CAB study a regression equation with five explanatory variables was formulated which had two undesirable features. The first was the inclusion of time as an explanatory variable. The use of time is undesirable since, from a policy analysis point of view, the analyst has no "control" over this variable, and it is usually included only to act as a proxy for other, perhaps significant, variables inadvertently omitted from the equation. The second undesirable feature of the CAB model is the "delta log" form of the equation (the first difference in the logs of the variables), which allowed a forecasting interval of only one year into the future. This form was the result of the application of a standard correction procedure for collinearity among some of the explanatory variables. In view of these two undesirable features, it was decided to attempt to improve on the CAB model.
In addition to the explanatory variables considered in the CAB study, a number of other variables were analyzed to determine their appropriateness in the model. Sections II and III of this report describe the total set of variables investigated as well as a method for searching for the "best" subset. Section IV then outlines the decisions involved in selecting the appropriate form of the equation. The second objective of this study is to describe a battery of statistical tests, some common and some not so common, which evaluate the validity of each of the major assumptions underlying the OLS estimation procedure with respect to single-equation regression models. The major assumptions assessed in Section V of this report concern homoscedasticity, normality, autocorrelation, and multicollinearity. The intent here is not to present all of the statistical tests that are available, for to do so would be the purpose of regression textbooks, but to scrutinize these four major assumptions enough to remind the analyst that it is essential to investigate in depth the validity and precision of the model with respect to its expanded use in policy analysis. It is hoped that the procedure outlined in this report sets an example to demand modeling analysts of the essential elements used in the development of reliable forecasting tools. The third and ultimate objective of this work is to demonstrate the use of some advanced corrective procedures in the event that any of the four above-mentioned assumptions have been violated. For example, the problem of autocorrelation can be resolved by the use of generalized least-squares (GLS), which is demonstrated in Section VI of this report; and the problem of multicollinearity, usually corrected by employing the cumbersome and restrictive delta log form of equation, has been eliminated by using Ridge regression (detailed in Section VII).
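As a rough illustration of how Ridge regression stabilizes coefficients under multicollinearity, the sketch below solves the textbook ridge system (X'X + kI) b = X'y for two centered regressors. It is not the report's implementation: the function names, the data, and the choice k = 1 are assumptions made for the example, and in practice the regressors are usually standardized and k is chosen by inspection of a ridge trace or a similar criterion.

```python
def center(values):
    """Subtract the mean so that no intercept term is needed."""
    m = sum(values) / len(values)
    return [v - m for v in values]

def ridge_two_regressors(x1, x2, y, k):
    """Solve (X'X + k I) b = X'y for two centered regressors.

    k = 0 reproduces ordinary least squares; k > 0 shrinks the estimates.
    """
    x1c, x2c, yc = center(x1), center(x2), center(y)
    a11 = sum(a * a for a in x1c) + k
    a22 = sum(a * a for a in x2c) + k
    a12 = sum(a * b for a, b in zip(x1c, x2c))
    g1 = sum(a * b for a, b in zip(x1c, yc))
    g2 = sum(a * b for a, b in zip(x2c, yc))
    det = a11 * a22 - a12 * a12          # 2x2 determinant (Cramer's rule)
    b1 = (a22 * g1 - a12 * g2) / det
    b2 = (a11 * g2 - a12 * g1) / det
    return b1, b2

# Hypothetical, strongly collinear regressors (not data from the report).
x1 = [1, 2, 3, 4, 5, 6]
x2 = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]
y = [2 * a + 3 * b for a, b in zip(x1, x2)]   # exact relation: b = (2, 3)

ols = ridge_two_regressors(x1, x2, y, k=0.0)
ridge = ridge_two_regressors(x1, x2, y, k=1.0)
print("OLS   (k = 0):", [round(b, 3) for b in ols])
print("Ridge (k = 1):", [round(b, 3) for b in ridge])
```

Adding k to the diagonal moves the system away from singularity, so the solution is less sensitive to small changes in the data, at the cost of some bias in the estimates.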
Finally, in Section VIII an attempt is made to determine the "robustness" of a model by first performing an examination of the residuals using such techniques as the "hat matrix", and second by the application of the recently developed estimation procedures of Robust regression. Although the techniques of Ridge and Robust regression are still in the experimental stages, sufficient research has been performed to warrant their application to significantly improve the currently operational regression models.
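The role of the "hat matrix" in examining a fitted model can be sketched for the simplest case, a regression on one explanatory variable plus an intercept, where the diagonal elements have the closed form h_i = 1/n + (x_i - mean)^2 / Sxx. The function name, the data, and the 2p/n flagging rule used below are assumptions for illustration; the abstract does not state which criteria the report itself applies.

```python
def leverages(x):
    """Diagonal of the hat matrix for a regression of y on [1, x]."""
    n = len(x)
    mean = sum(x) / n
    sxx = sum((v - mean) ** 2 for v in x)
    return [1.0 / n + (v - mean) ** 2 / sxx for v in x]

# Hypothetical observations with one extreme value of the regressor.
x = [1, 2, 3, 4, 5, 20]
h = leverages(x)

# The leverages sum to the number of fitted parameters (here p = 2).
p = 2
cutoff = 2 * p / len(x)   # a common rule of thumb, assumed for this sketch
flagged = [i for i, hi in enumerate(h) if hi > cutoff]

print("leverages:", [round(hi, 3) for hi in h])
print("sum of leverages:", round(sum(h), 6))
print("high-leverage observations (0-based index):", flagged)
```

Observations with high leverage pull the fitted line toward themselves, so their residuals can look small even when they are unusual; this is why leverage is examined alongside the residuals.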
August 1977
Includes bibliographical references (p. 76-78)
Cambridge, Mass. : Massachusetts Institute of Technology, Flight Transportation Laboratory
FTL report (Massachusetts Institute of Technology. Flight Transportation Laboratory) ; R77-3
Airlines; Aeronautics; Aeronautics, Commercial; Management; Flights; Passenger traffic