Strengthening Policy Analysis - Econometric tests using microcomputer software + disk (IFPRI, 1995, 166 p.)
Observe a swimmer trying to simultaneously submerge five inflatable beach balls. The swimmer struggles for some time. When the swimmer finally succeeds, he or she has a photograph taken. The swimmer quickly loses control and the balls explode above the surface of the water. The photographer is an econometrician.
- Anonymous econometrics professor
In the absence of comprehensive and well-presented empirical analyses, we had no option but to follow our political instincts.
- Anonymous policymaker
Improvements in econometric methods and the machines that run them, alongside dramatic increases in the quality and quantity of information available to inform policymakers, are bridging the gap between what is known and what is needed to guide policy. These developments make it easier for econometricians to model policy with some degree of confidence. The developments also make empirically based research more accessible to policy analysts. Likewise, policymakers can and should have more options than the anonymous policymaker quoted above - and, indeed, this new wealth of information forces them to look beyond political instincts for guidance. This manual contains critical structural support for the evolving bridge between policy needs and knowledge.
There is, however, both good and bad news associated with the explosion in the use of econometric procedures and tests in policy research witnessed in the past 10 years. The bad news is that this trend has led to a growing realization that estimated policy parameters are highly sensitive to the ways in which data are handled and the ways in which econometric models are constructed. The good news is that the ability to conduct tests that can gauge these sensitivities has improved with the emergence of powerful microcomputers, of statistical software that combines ease of use with statistical power, and of texts on applied econometrics. These developments permit econometricians and policy analysts to improve the accuracy and reliability of estimated policy parameters and, at the very least, to indicate where their models are most sensitive to specification error and departures from standard assumptions. The following example illustrates the usefulness of these tests for policy formulation.
Until recently, one widely accepted notion about development was that poverty alleviation was necessary and sufficient for reductions in undernutrition to occur. The implication was that income-generation policies have a strong effect on household food consumption and nutrition status. Recent econometric work (Behrman and Deolalikar 1987; Bouis and Haddad 1992) has cast some doubt on whether increasing income alone is sufficient to alleviate undernutrition. The policy choice revolves around the magnitude of the calorie-income elasticity and, by extension, the estimated coefficient on income (the marginal propensity to consume) when calorie consumption is the dependent variable. Bouis and Haddad (1992) found that, for the same households, two-stage least squares (2SLS) estimates differed from ordinary least squares (OLS) estimates. Calculation of the Levi bounds on the marginal propensity to consume indicated that income from the survey was measured with considerable error. A more formal Hausman-Wu test established that the differences between the OLS and 2SLS estimates were large enough to reject the use of the OLS estimates because of their bias. The differences were due, in part, to the endogeneity of income on the right-hand side, which was itself caused in part by measurement error in income. The elasticity estimate was therefore sensitive to the choice of estimator, and this sensitivity has consequences for policy formulation. If the larger elasticity estimates of approximately 0.5 are believed, policy can be more focused on income generation. If, on the other hand, the smaller elasticity estimates of approximately 0.1 are believed, the focus of policy perhaps should be expanded toward complementary factors for reducing undernutrition (such as education, community sanitation, water quality, and the availability of medical supplies), and toward the importance of other dimensions of undernutrition (such as micronutrient consumption and individual-level versus household-level consumption).
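The mechanics behind this example can be illustrated with a small simulation. The sketch below is not a replication of the cited studies: the data are artificial, the variable names are hypothetical, and it assumes classical measurement error (independent of everything else) plus a second, independently mismeasured income indicator as the instrument. Under those assumptions OLS is biased toward zero, 2SLS is not, and a regression-based form of the Hausman-Wu test (adding the first-stage residual to the structural equation) detects the difference. In real survey data the direction and size of the bias depend on the error structure.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients, conventional standard errors, and residuals."""
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * xtx_inv))
    return beta, se, resid

rng = np.random.default_rng(0)
n = 5000
true_slope = 0.5  # assumed "true" income coefficient for the simulation

# Artificial data (all names and values are illustrative assumptions).
true_income = rng.normal(10.0, 1.0, n)                  # never observed
observed_income = true_income + rng.normal(0, 1.0, n)   # survey income, with error
instrument = true_income + rng.normal(0, 1.0, n)        # second noisy indicator
calories = 2.0 + true_slope * true_income + rng.normal(0, 0.5, n)

const = np.ones(n)
X = np.column_stack([const, observed_income])
Z = np.column_stack([const, instrument])

# OLS on mismeasured income: slope is attenuated toward zero.
b_ols, _, _ = ols(X, calories)

# 2SLS by hand: first stage, then regress calories on fitted income.
# (Only the coefficients are used; second-stage standard errors computed
# this way would not be valid.)
gamma, _, first_stage_resid = ols(Z, observed_income)
income_hat = Z @ gamma
b_2sls, _, _ = ols(np.column_stack([const, income_hat]), calories)

# Regression-based Hausman-Wu test: a significant coefficient on the
# first-stage residual signals that OLS and 2SLS differ systematically.
X_aug = np.column_stack([const, observed_income, first_stage_resid])
b_aug, se_aug, _ = ols(X_aug, calories)
t_stat = b_aug[2] / se_aug[2]

print(f"OLS slope:  {b_ols[1]:.3f}")
print(f"2SLS slope: {b_2sls[1]:.3f}")
print(f"Hausman-Wu t statistic: {t_stat:.2f}")
```

With equal variances for true income and its measurement error, the OLS slope converges to half the assumed true value, which is why the two estimators diverge so sharply in this setup.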
This manual outlines how to conduct some of these basic econometric specification tests and procedures, how to interpret the results, and how to modify the econometric approach as a result of the tests. The tests and procedures are largely confined to cross-section analyses, as opposed to time-series analyses. This reflects IFPRI's current research orientation as well as the nature of existing data used to support policy research in developing countries. The tests address a number of issues: (1) the validity of the assumption of normality and constant variance of the error term, (2) the selection of the most appropriate explanatory variables to include in a model, (3) the appropriateness of the model under different structural conditions, (4) the need to account for measurement errors in explanatory variables, (5) the detection and treatment of outlier observations, and (6) what can be done (if anything) about missing data points.
Each test and procedure is described in terms of why, when, and how it might be used. Sample programs, with software code from SPSS/PC+ [1], the SAS system for personal computers (hereafter referred to as SAS PC), and GAUSS-386 [2], are presented to demonstrate how the procedure can be executed with the sample data set (an ASCII file named DATA.ASC). These sample programs are also on the diskette that accompanies the manual. Keep in mind that there may be several alternative programming strategies in any given instance; generally only one is presented in this manual. At the end of each section, several widely used econometrics textbooks are listed that discuss the tests provided in the manual and alternatives that are not. Unless otherwise noted, all programs run in under 3 minutes using the sample data set on a DOS-based ZEOS 486DX2 desktop computer running at 66 megahertz. The econometric procedures selected are not exhaustive; rather, they reflect IFPRI's collective ongoing experience in using econometrics and widely available software packages for food policy analysis in developing countries.
[1] SPSS/PC+ for Windows, a recently released product, is not outlined in this manual.
[2] Companies producing the computer software mentioned in this manual are listed in the notes to Table 1.
In addition, it is important to remember that parameter estimates are sensitive to the quality of data used as well as the appropriateness of the econometric approach. To that end, the reader is encouraged to make use of the first paper in this Microcomputers in Policy Research series, Designing a Data Entry and Verification System, by Peter A. Tatian. Finally, it is hoped that readers will alert the authors to any errors found in this manual, together with their suggestions for additional materials to include in future versions of this manual, and in the series generally.
Standard econometric notation is used throughout this manual. In general, Roman letters refer to data matrices and Greek letters refer to model parameters and to stochastic error terms. The basic model is written as follows:
y = Xβ + ε.
In the model above, y is an N × 1 vector of observations on the dependent variable; X is an N × K matrix of observations on the K explanatory variables (including the constant term); β is a K × 1 vector of parameters; ε is an N × 1 vector of unobservable stochastic disturbance terms; and N is the sample size.
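To make the dimensions concrete, the following minimal sketch builds artificial data that follow the basic model and recovers the parameter vector by least squares. The sample size, parameter values, and variable names are assumptions chosen for illustration; they are not taken from the manual's sample data set. NumPy stores y as a one-dimensional array of length N rather than an explicit N × 1 column, which does not affect the algebra.

```python
import numpy as np

rng = np.random.default_rng(42)

N, K = 200, 3  # assumed sample size and number of regressors (incl. constant)

# X: N x K matrix whose first column is the constant term.
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])

# Assumed "true" parameter vector (length K) and disturbances (length N).
beta = np.array([1.0, 2.0, -0.5])
eps = rng.normal(scale=1.0, size=N)

# The basic model: the dependent variable is X times the parameters
# plus the disturbance term.
y = X @ beta + eps

# Least-squares estimate of the parameter vector.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta_hat

print("X shape:", X.shape, "| y shape:", y.shape)
print("estimated parameters:", np.round(beta_hat, 3))
```

By construction the least-squares residuals are orthogonal to every column of X, so they always sum to zero when a constant is included; that property says nothing about whether the true disturbances satisfy the model's assumptions.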
It is generally assumed that the matrix X contains all of the appropriate regressors in the appropriate functional form, and that the classical normal assumptions concerning the stochastic disturbance terms hold: they have zero mean and are homoskedastic, not autocorrelated, uncorrelated with the regressors, and normally distributed. This manual is largely devoted to examining the definitions of X and to checking for heteroskedasticity and correlation between regressors and the stochastic disturbance term.
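As one illustration of what checking the constant-variance assumption can look like, the sketch below computes the studentized (Koenker) form of the Breusch-Pagan statistic: N times the R-squared from regressing squared OLS residuals on the regressors. This is a commonly used check and is offered here as an assumption, not as the specific procedure the manual presents. The data are artificial, the function names are hypothetical, and the statistic is compared with 3.841, the 5 percent chi-square critical value for one degree of freedom.

```python
import numpy as np

def ols_fit(X, y):
    """Return least-squares coefficients and residuals."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, y - X @ coef

def koenker_bp_stat(X, y):
    """N * R^2 from the auxiliary regression of squared residuals on X.

    X must include a constant column. Under homoskedasticity the statistic
    is asymptotically chi-square with (number of columns - 1) degrees of freedom.
    """
    _, resid = ols_fit(X, y)
    u2 = resid ** 2
    _, aux_resid = ols_fit(X, u2)
    r2 = 1.0 - (aux_resid @ aux_resid) / np.sum((u2 - u2.mean()) ** 2)
    return X.shape[0] * r2

rng = np.random.default_rng(1)
N = 2000
x = rng.uniform(1.0, 5.0, N)
X = np.column_stack([np.ones(N), x])

# Same mean function; only the error variance differs between the two cases.
y_homo = 1.0 + 0.5 * x + rng.normal(0, 1.0, N)        # constant variance
y_hetero = 1.0 + 0.5 * x + rng.normal(0, 1.0, N) * x  # std. dev. grows with x

CHI2_95_DF1 = 3.841  # 5 percent critical value, chi-square with 1 d.f.

stat_homo = koenker_bp_stat(X, y_homo)
stat_hetero = koenker_bp_stat(X, y_hetero)

print(f"constant-variance data: statistic = {stat_homo:.2f}")
print(f"heteroskedastic data:   statistic = {stat_hetero:.2f}")
print(f"reject above {CHI2_95_DF1}")
```

A statistic above the critical value suggests that the error variance moves with the regressors, in which case the conventional OLS standard errors are unreliable even though the coefficient estimates remain unbiased.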
Extensions of this notation are required periodically in the manual and are introduced as needed. Usually, however, the dimensions of vectors and matrices, unless required for clarity, will not be repeated.