Suppose we wish to estimate the effect of the federally funded school lunch program on student performance. The file MEAP93.xls contains data on 408 high schools in Michigan for the year 1993.
- math10: percentage of tenth graders at a high school receiving a passing score on a standardized mathematics exam.
- expend: expenditures per student (in dollars).
- lnchprg: percentage of students who are eligible for the federal free lunch program.
- enrol: number of students enrolled, which measures school size.
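Consider the regression model
$$\text{math10} = \beta_0 + \beta_1 \log(\text{expend}) + \beta_2\,\text{lnchprg} + \beta_3 \log(\text{enrol}) + u$$
(a) Estimate the model using EViews and write down the fitted equation (including the sample size, t-statistics, and R-squared).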
(b) Test the overall validity of the regression model at the 5% significance level.
(c) Does the lunch program increase student performance? Test the hypothesis at the 5% level.
(d) Based on the regression output, if expend increases by 10%, what is the estimated percentage point change in math10, holding lnchprg and enrol constant?
(e) Now run the simple regression of math10 on lnchprg such that
$$\text{math10} = \beta_0 + \beta_1\,\text{lnchprg} + u$$
Compare the EViews output with that from part (a). Which model would you choose? Why? [Hint: should we use R-squared or adjusted R-squared for model comparison?]
Suppose a researcher wants to analyse the short-term interest rate, and she develops a model given by
$$i3_t = \beta_0 + \beta_1\,inf_t + \beta_2\,def_t + u_t$$
where $i3_t$ is the three-month T-bill rate, $inf_t$ is the annual inflation rate based on the consumer price index (CPI), and $def_t$ is the federal budget surplus or deficit as a percentage of GDP. She collects yearly data for the period from 1960 to 2016 (57 observations) from the Federal Reserve Economic Data. They are in the Excel file Interest_rate.xlsx.
(a) Plot the three variables in a line graph using EViews and comment on the dynamics (any co-movement between the interest rate and the other two variables).
(b) Estimate the regression model using EViews and provide the output.
(c) What does the coefficient of determination tell you?
(d) Interpret the coefficients $\hat{\beta}_1$ and $\hat{\beta}_2$. Do the signs of the estimated coefficients agree with standard economic intuition?
(e) Conduct a test of H0: there is no second-order serial correlation in the errors of this model at the 5% significance level.
(f) Re-estimate the equation with Newey-West standard errors and provide the EViews output. Compare the serial correlation consistent estimation with the output from part (b).
(g) Test the null hypothesis that a one percent increase in the annual inflation rate leads to a one percent increase in the short-term interest rate on average at the 5% level.
In the file pubexp.xls there are data on public expenditure on education (EE), gross domestic product (GDP), and population (P) for 34 countries in the year 1980. It is hypothesized that per capita expenditure on education is linearly related to per capita GDP. That is,
$$y_i = \beta_1 + \beta_2 x_i + e_i$$
where $y_i = EE_i/P_i$ and $x_i = GDP_i/P_i$.
Import the data into EViews.
(a) It is suspected that $e_i$ may be heteroskedastic with a variance related to $x_i$. Why might the suspicion about heteroskedasticity be reasonable?
(b) Draw a scatter plot between x and y using EViews and comment on the graph (what is the expected relationship, and is there any evidence of heteroskedasticity?). Hint: To generate variable x using EViews, click Genr and type x=GDP/P in the box.
(c) Run the OLS regression and provide the EViews output. Does the sign of the slope coefficient make sense? Explain.
(d) Test for the existence of heteroskedasticity using a White test (assuming a 5% level of significance).
(e) Re-estimate the equation with White consistent standard errors and provide the EViews output. Compare the heteroskedasticity consistent estimation with the output from part (c).
Model Estimation with Eviews
a.
The regression output is given as
Dependent Variable: MATH10
Method: Least Squares
Date: 06/08/18 Time: 13:07
Sample: 1 408
Included observations: 408

| Variable | Coefficient | Std. Error | t-Statistic | Prob. |
|---|---|---|---|---|
| C | -23.13766 | 24.99323 | -0.925757 | 0.3551 |
| LOG(EXPEND) | 7.746062 | 3.041386 | 2.546885 | 0.0112 |
| LNCHPRG | -0.323927 | 0.036319 | -8.918835 | 0.0000 |
| LOG(ENROL) | -1.255435 | 0.581173 | -2.160175 | 0.0313 |

| Statistic | Value | Statistic | Value |
|---|---|---|---|
| R-squared | 0.189291 | Mean dependent var | 24.10686 |
| Adjusted R-squared | 0.183271 | S.D. dependent var | 10.49361 |
| S.E. of regression | 9.483400 | Akaike info criterion | 7.346718 |
| Sum squared resid | 36333.69 | Schwarz criterion | 7.386044 |
| Log likelihood | -1494.731 | Hannan-Quinn criter. | 7.362280 |
| F-statistic | 31.44310 | Durbin-Watson stat | 1.906937 |
| Prob(F-statistic) | 0.000000 | | |
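From the regression result, the fitted equation is obtained as
$$\widehat{\text{math10}} = -23.14 + 7.75\,\log(\text{expend}) - 0.324\,\text{lnchprg} - 1.255\,\log(\text{enrol})$$
with t-statistics of -0.93, 2.55, -8.92 and -2.16 respectively, n = 408 and R-squared = 0.189.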
The overall significance of the model is examined with an F-test.
Null hypothesis: The coefficients of all the independent variables are zero.
Alternative hypothesis: At least one of the coefficients is different from zero.
The F-statistic from the regression output is 31.44, with a p-value of 0.0000. As the p-value is less than the significance level of 0.05, the null hypothesis that all slope coefficients are zero is rejected. The result thus suggests that the model is significant overall.
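As a cross-check on the reported value, the overall F-statistic can be recovered from R-squared alone. The sketch below is a minimal illustration, assuming the standard relation F = (R²/k) / ((1 − R²)/(n − k − 1)) for a model with k slope coefficients; the function name `overall_f` is hypothetical and the inputs are copied from the EViews output for part (a).

```python
def overall_f(r_squared: float, n_obs: int, n_slopes: int) -> float:
    """F-statistic for H0: all slope coefficients are zero."""
    explained = r_squared / n_slopes
    unexplained = (1.0 - r_squared) / (n_obs - n_slopes - 1)
    return explained / unexplained


# Values copied from the EViews output for part (a).
f_stat = overall_f(r_squared=0.189291, n_obs=408, n_slopes=3)

# Should be close to the reported F-statistic of 31.44310.
print(round(f_stat, 2))
```

The 5% critical value of F(3, 404) is roughly 2.6 in standard tables, so a statistic of this size rejects the null hypothesis comfortably.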
As obtained from the regression result, the coefficient of lnchprg is -0.32. The negative coefficient indicates an inverse relation between the percentage of students eligible for the lunch program and the percentage of tenth graders passing the standardized math exam: a one percentage point rise in lnchprg is associated with a fall of about 0.32 percentage points in math10, holding the other variables constant. The p-value of the coefficient is 0.0000. As the p-value is smaller than the significance level of 0.05, the null hypothesis of no relation between the lunch program and student performance is rejected at the 5% level. Because the estimate is negative, however, the data give no support to the one-sided alternative that the lunch program increases performance. Therefore, the regression does not show that the lunch program increases student performance; the estimated relation is negative.
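The coefficient of log(expend) is 7.75. Because expend enters in logs while math10 is in levels, a 1% increase in expend changes math10 by about 7.75/100 = 0.0775 percentage points. A 10% increase in expend is therefore estimated to raise math10 by about 0.77 percentage points, holding lnchprg and enrol constant.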
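The level-log arithmetic can be sketched as follows. This is only an illustration using the coefficient from part (a); it shows both the usual approximation (coefficient divided by 100, times the percentage change) and the exact change implied by the fitted equation, which uses the change in the logarithm. The variable names are assumptions made for the example.

```python
import math

beta_log_expend = 7.746062  # coefficient on LOG(EXPEND) from part (a)
pct_increase = 10.0         # percentage rise in expend asked about in part (d)

# Usual approximation: change in math10 = (beta / 100) * (% change in expend).
approx_change = beta_log_expend / 100.0 * pct_increase

# Exact change implied by the fitted equation: beta * ln(1 + 10/100).
exact_change = beta_log_expend * math.log(1.0 + pct_increase / 100.0)

print(f"approximate: {approx_change:.3f} percentage points")
print(f"exact:       {exact_change:.3f} percentage points")
```

Either way the estimated effect is well under one percentage point, not a change of tens of percent.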
Dependent Variable: MATH10
Method: Least Squares
Date: 06/08/18 Time: 14:22
Sample: 1 408
Included observations: 408

| Variable | Coefficient | Std. Error | t-Statistic | Prob. |
|---|---|---|---|---|
| C | 32.14271 | 0.997582 | 32.22061 | 0.0000 |
| LNCHPRG | -0.318864 | 0.034839 | -9.152422 | 0.0000 |

| Statistic | Value | Statistic | Value |
|---|---|---|---|
| R-squared | 0.171034 | Mean dependent var | 24.10686 |
| Adjusted R-squared | 0.168992 | S.D. dependent var | 10.49361 |
| S.E. of regression | 9.565938 | Akaike info criterion | 7.359184 |
| Sum squared resid | 37151.91 | Schwarz criterion | 7.378847 |
| Log likelihood | -1499.274 | Hannan-Quinn criter. | 7.366965 |
| F-statistic | 83.76683 | Durbin-Watson stat | 1.907745 |
| Prob(F-statistic) | 0.000000 | | |
In both models, the coefficient of lnchprg is negative and statistically significant, and its magnitude is about -0.32 in each. Because the two models have different numbers of regressors, they should be compared using adjusted R-squared rather than R-squared: R-squared never falls when a regressor is added, whereas adjusted R-squared penalizes additional regressors. For the model in part (a) the adjusted R-squared is 0.183; for the simple regression it is 0.169. A higher value indicates that a larger share of the variation in the dependent variable is explained after allowing for the number of regressors. In terms of adjusted R-squared, the model in part (a) is therefore preferred to the simple regression.
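The adjusted R-squared values in the two outputs can be reproduced from R-squared, the sample size and the number of slope coefficients. The sketch below assumes the standard formula 1 − (1 − R²)(n − 1)/(n − k − 1); the helper name `adjusted_r2` is hypothetical.

```python
def adjusted_r2(r_squared: float, n_obs: int, n_slopes: int) -> float:
    """Adjusted R-squared for a regression with an intercept and n_slopes regressors."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_slopes - 1)


# R-squared values copied from the two EViews outputs.
adj_full = adjusted_r2(0.189291, n_obs=408, n_slopes=3)    # model in part (a)
adj_simple = adjusted_r2(0.171034, n_obs=408, n_slopes=1)  # simple regression

print(f"part (a) model:    {adj_full:.6f}")
print(f"simple regression: {adj_simple:.6f}")
print("prefer part (a) model:", adj_full > adj_simple)
```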
a.
Figure 1: Line plot of T-bill rate, inflation rate and federal budget balance
As the figure above shows, the interest rate and the inflation rate move in the same direction. Movements in the interest rate follow movements in inflation, indicating a positive association between them. Most of the time, the interest rate is above the inflation rate. The federal budget variable does not show any clear co-movement with the Treasury bill rate.
The regression output for the interest rate model is given as
Dependent Variable: I3
Method: Least Squares
Date: 06/08/18 Time: 12:04
Sample: 1960 2016
Included observations: 57

| Variable | Coefficient | Std. Error | t-Statistic | Prob. |
|---|---|---|---|---|
| C | 2.056980 | 0.556708 | 3.694899 | 0.0005 |
| INF | 0.817576 | 0.096280 | 8.491619 | 0.0000 |
| DEF | 0.190321 | 0.115244 | 1.651455 | 0.1045 |

| Statistic | Value | Statistic | Value |
|---|---|---|---|
| R-squared | 0.590980 | Mean dependent var | 4.687295 |
| Adjusted R-squared | 0.575831 | S.D. dependent var | 3.109814 |
| S.E. of regression | 2.025369 | Akaike info criterion | 4.300576 |
| Sum squared resid | 221.5143 | Schwarz criterion | 4.408105 |
| Log likelihood | -119.5664 | Hannan-Quinn criter. | 4.342366 |
| F-statistic | 39.01139 | Durbin-Watson stat | 0.444886 |
| Prob(F-statistic) | 0.000000 | | |
Estimated regression equation
$$\widehat{i3}_t = 2.057 + 0.818\,inf_t + 0.190\,def_t, \qquad n = 57,\ R^2 = 0.591$$
The R-squared value is the coefficient of determination. It indicates how much of the variation in the dependent variable is explained by the independent variables. In the present model, R-squared is 0.59. This implies that inflation and the federal budget balance together explain 59 percent of the variation in the interest rate.
The coefficient on inflation is 0.82. This implies a positive relation between the rate of inflation and the three-month T-bill rate: a one percentage point rise in inflation is associated with a rise of about 0.82 percentage points in the T-bill rate, holding the budget variable constant. The p-value of the coefficient is 0.0000. As it is less than the significance level of 0.05, the null hypothesis of no relation between inflation and the T-bill rate is rejected. The coefficient on the federal budget variable is 0.19. The positive coefficient indicates that the budget variable is positively associated with the T-bill rate. The p-value of the coefficient is 0.10, which is greater than the significance level of 0.05. The null hypothesis of no relation between the T-bill rate and the budget variable is therefore not rejected.
Inflation and expected inflation affect the interest rate on Treasury bills. Periods of high inflation are generally associated with high interest rates. From the regression result, a positive and significant relation is obtained between inflation and the three-month T-bill rate. The sign of the inflation coefficient is thus consistent with standard economic intuition. A federal budget deficit is associated with inflationary pressure. In the presence of a budget deficit, the central bank purchases securities issued by the government. This raises the growth of the monetary base, creating inflationary pressure, which has a positive impact on the interest rate. Although the budget variable is not statistically significant, it has the expected sign.
Test of serial correlation
Hypothesis
Null hypothesis: There is no second-order serial correlation in the errors of the model.
Alternative hypothesis: There is second-order serial correlation in the errors of the model.
Auxiliary regression
The auxiliary regression regresses the current residuals on all the explanatory variables and on p lagged residuals (here p = 2). The Breusch-Godfrey test statistic is the number of observations used in the auxiliary regression multiplied by its R-squared. EViews sets the presample lagged residuals to zero and so reports T*R2 (Obs*R-squared), where T is the number of observations; if the first p observations are dropped instead, the statistic is (T – p)*R2. Under the null hypothesis the test statistic follows a chi-square distribution with p degrees of freedom.
Decision rule
The null hypothesis is rejected if P value of the Breusch-Godfrey test statistics is less than 0.05.
Conclusion
From the test result, the LM statistic (Obs*R-squared) is 34.47 and its p-value is 0.0000. As the p-value is less than 0.05, the null hypothesis of no serial correlation up to second order is rejected. This implies that the errors of the model are serially correlated.
Breusch-Godfrey Serial Correlation LM Test:

| Test statistic | Value | Probability | Value |
|---|---|---|---|
| F-statistic | 39.78589 | Prob. F(2,52) | 0.0000 |
| Obs*R-squared | 34.47237 | Prob. Chi-Square(2) | 0.0000 |

Test Equation:
Dependent Variable: RESID
Method: Least Squares
Date: 06/08/18 Time: 12:50
Sample: 1960 2016
Included observations: 57
Presample missing value lagged residuals set to zero.

| Variable | Coefficient | Std. Error | t-Statistic | Prob. |
|---|---|---|---|---|
| C | -0.148451 | 0.357065 | -0.415752 | 0.6793 |
| INF | 0.003478 | 0.061905 | 0.056180 | 0.9554 |
| DEF | -0.040788 | 0.074426 | -0.548033 | 0.5860 |
| RESID(-1) | 0.738955 | 0.140036 | 5.276884 | 0.0000 |
| RESID(-2) | 0.062394 | 0.140898 | 0.442831 | 0.6597 |

| Statistic | Value | Statistic | Value |
|---|---|---|---|
| R-squared | 0.604778 | Mean dependent var | -7.64E-16 |
| Adjusted R-squared | 0.574377 | S.D. dependent var | 1.988872 |
| S.E. of regression | 1.297536 | Akaike info criterion | 3.442443 |
| Sum squared resid | 87.54723 | Schwarz criterion | 3.621658 |
| Log likelihood | -93.10962 | Hannan-Quinn criter. | 3.512092 |
| F-statistic | 19.89295 | Durbin-Watson stat | 1.995246 |
| Prob(F-statistic) | 0.000000 | | |
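The Obs*R-squared line of the test output can be reproduced from the auxiliary regression. The sketch below is a minimal illustration using only the numbers in the table; it relies on the fact that the upper-tail probability of a chi-square variable with 2 degrees of freedom is exp(−x/2), so no statistics library is needed. The function name `chi2_sf_two_df` is hypothetical.

```python
import math


def chi2_sf_two_df(stat: float) -> float:
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-stat / 2.0)


n_obs = 57         # observations in the auxiliary regression
aux_r2 = 0.604778  # R-squared of the auxiliary regression

lm_stat = n_obs * aux_r2           # Obs*R-squared
p_value = chi2_sf_two_df(lm_stat)  # valid here because two lags are tested
reject_h0 = p_value < 0.05

print(f"LM statistic: {lm_stat:.5f}")
print(f"p-value:      {p_value:.2e}")
print("reject H0 of no serial correlation:", reject_h0)
```

The closed-form tail probability only applies to 2 degrees of freedom; a test with a different number of lags would need a general chi-square routine.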
Newey-West standard error estimation
Dependent Variable: I3
Method: Least Squares
Date: 06/10/18 Time: 10:12
Sample: 1960 2016
Included observations: 57
HAC standard errors & covariance (Bartlett kernel, Newey-West fixed bandwidth = 4.0000)

| Variable | Coefficient | Std. Error | t-Statistic | Prob. |
|---|---|---|---|---|
| C | 2.056980 | 0.525294 | 3.915866 | 0.0003 |
| INF | 0.817576 | 0.129667 | 6.305189 | 0.0000 |
| DEF | 0.190321 | 0.178272 | 1.067589 | 0.2905 |

| Statistic | Value | Statistic | Value |
|---|---|---|---|
| R-squared | 0.590980 | Mean dependent var | 4.687295 |
| Adjusted R-squared | 0.575831 | S.D. dependent var | 3.109814 |
| S.E. of regression | 2.025369 | Akaike info criterion | 4.300576 |
| Sum squared resid | 221.5143 | Schwarz criterion | 4.408105 |
| Log likelihood | -119.5664 | Hannan-Quinn criter. | 4.342366 |
| F-statistic | 39.01139 | Durbin-Watson stat | 0.444886 |
| Prob(F-statistic) | 0.000000 | Wald F-statistic | 21.38973 |
| Prob(Wald F-statistic) | 0.000000 | | |
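The Newey-West estimation leaves the coefficient estimates, R-squared and the other summary statistics exactly as in part (b), because only the standard errors are recomputed. The standard error of INF rises from 0.096 to 0.130 and that of DEF from 0.115 to 0.178, so the t-statistics fall from 8.49 to 6.31 for INF and from 1.65 to 1.07 for DEF. The conclusions are unchanged: INF remains significant at the 5% level and DEF remains insignificant.
Part (g)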
Dependent Variable: I3
Method: Least Squares
Date: 06/08/18 Time: 13:02
Sample: 1960 2016
Included observations: 57
HAC standard errors & covariance (Bartlett kernel, Newey-West fixed bandwidth = 4.0000)

| Variable | Coefficient | Std. Error | t-Statistic | Prob. |
|---|---|---|---|---|
| C | 1.522099 | 0.677918 | 2.245255 | 0.0288 |
| INF | 0.832006 | 0.131128 | 6.345005 | 0.0000 |

| Statistic | Value | Statistic | Value |
|---|---|---|---|
| R-squared | 0.570322 | Mean dependent var | 4.687295 |
| Adjusted R-squared | 0.562510 | S.D. dependent var | 3.109814 |
| S.E. of regression | 2.056927 | Akaike info criterion | 4.314760 |
| Sum squared resid | 232.7021 | Schwarz criterion | 4.386446 |
| Log likelihood | -120.9707 | Hannan-Quinn criter. | 4.342620 |
| F-statistic | 73.00279 | Durbin-Watson stat | 0.421116 |
| Prob(F-statistic) | 0.000000 | Wald F-statistic | 40.25909 |
| Prob(Wald F-statistic) | 0.000000 | | |
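From the estimated regression result, the coefficient of inflation is 0.83. This means that a one percentage point increase in the inflation rate raises the T-bill rate by about 0.83 percentage points on average. The claim that a one percent increase in inflation leads to a one percent increase in the interest rate corresponds to the null hypothesis that the inflation coefficient equals 1, tested against the alternative that it differs from 1. The test does not depend on R-squared; it uses the t-statistic formed from the estimate and its standard error. With the Newey-West standard error, t = (0.832006 - 1)/0.131128 = -1.28. Its absolute value is smaller than the 5% two-sided critical value of about 2.00 for 55 degrees of freedom, so the null hypothesis is not rejected. The data are consistent with a one-for-one response of the interest rate to inflation.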
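A test of a coefficient against a value other than zero uses the t-statistic (estimate − hypothesized value) / standard error. The sketch below works through that arithmetic with the Part (g) output; the critical value is an approximate table value for 55 degrees of freedom, entered by hand as an assumption rather than computed.

```python
coef_inf = 0.832006   # coefficient on INF from the Part (g) output
hac_se = 0.131128     # Newey-West standard error of that coefficient
hypothesized = 1.0    # H0: the inflation coefficient equals 1

t_stat = (coef_inf - hypothesized) / hac_se

# Approximate two-sided 5% critical value of t with 57 - 2 = 55 degrees of
# freedom, taken from standard tables (an assumption, not computed here).
critical_value = 2.004

reject_h0 = abs(t_stat) > critical_value

print(f"t-statistic: {t_stat:.4f}")
print("reject H0 (coefficient = 1):", reject_h0)
```

On these numbers the statistic lies inside the acceptance region, so a one-for-one response of the T-bill rate to inflation is not rejected at the 5% level.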
a.
The suspicion about heteroskedasticity is reasonable because countries with a higher per capita GDP have more money to distribute. Countries with a higher average income enjoy greater flexibility in their spending on education. Countries with a smaller per capita GDP have limited budget options, and hence their spending on education tends to vary less.
Figure 2: Scatter plot between x and y
The scatter plot between x and y reveals a positive, approximately linear relationship between the two variables. It also suggests heteroskedasticity, that is, non-constant variance of the error terms: small values of x are associated with a small scatter in y, while large values of x are associated with a large scatter in y.
The OLS regression output is as follows
Dependent Variable: Y
Method: Least Squares
Date: 06/08/18 Time: 15:27
Sample: 1 34
Included observations: 34

| Variable | Coefficient | Std. Error | t-Statistic | Prob. |
|---|---|---|---|---|
| C | -0.124573 | 0.048523 | -2.567308 | 0.0151 |
| X | 0.073173 | 0.005179 | 14.12755 | 0.0000 |

| Statistic | Value | Statistic | Value |
|---|---|---|---|
| R-squared | 0.861823 | Mean dependent var | 0.476735 |
| Adjusted R-squared | 0.857505 | S.D. dependent var | 0.359903 |
| S.E. of regression | 0.135858 | Akaike info criterion | -1.097394 |
| Sum squared resid | 0.590635 | Schwarz criterion | -1.007608 |
| Log likelihood | 20.65569 | Hannan-Quinn criter. | -1.066774 |
| F-statistic | 199.5875 | Durbin-Watson stat | 1.774258 |
| Prob(F-statistic) | 0.000000 | | |
The coefficient of X is 0.073. The positive coefficient indicates a positive association between per capita expenditure on education and per capita GDP. Because both variables enter in levels, a one-unit increase in per capita GDP is associated with a 0.073-unit increase in per capita expenditure on education. The p-value of the coefficient is 0.0000. As the p-value is less than the significance level of 0.05, the coefficient is statistically significant. The sign makes sense: as average income increases, there is more income to spend on education, and hence expenditure on education increases.
Test of Heteroskedasticity: White test
Hypothesis
Null hypothesis: The error variances are equal (homoskedasticity)
Alternative hypothesis: The error variances are not equal (heteroskedasticity)
In order to test for constant variance, an auxiliary regression is run. It regresses the squared residuals from the original regression on the regressors of the original model together with their squares and cross products; with a single regressor, these are x and x squared. The Lagrange multiplier test statistic is the product of the sample size and the R-squared of the auxiliary regression.
The test statistic follows a chi-square distribution with (P-1) degrees of freedom, where P is the number of parameters in the auxiliary regression. Here P = 3, giving 2 degrees of freedom.
The null hypothesis is rejected if the p-value of the chi-square statistic is less than the significance level of 0.05.
Conclusion
From the result of the White test, the chi-square statistic (Obs*R-squared) is 9.96 and its p-value is 0.0069. As the p-value is less than the significance level of 0.05, the null hypothesis of homoskedastic errors is rejected, implying the presence of heteroskedasticity in the model.
Heteroskedasticity Test: White

| Test statistic | Value | Probability | Value |
|---|---|---|---|
| F-statistic | 6.423121 | Prob. F(2,31) | 0.0046 |
| Obs*R-squared | 9.961452 | Prob. Chi-Square(2) | 0.0069 |
| Scaled explained SS | 11.90755 | Prob. Chi-Square(2) | 0.0026 |

Test Equation:
Dependent Variable: RESID^2
Method: Least Squares
Date: 06/11/18 Time: 12:27
Sample: 1 34
Included observations: 34

| Variable | Coefficient | Std. Error | t-Statistic | Prob. |
|---|---|---|---|---|
| C | 0.017677 | 0.016112 | 1.097134 | 0.2810 |
| X^2 | 0.000484 | 0.000264 | 1.834593 | 0.0762 |
| X | -0.005206 | 0.004548 | -1.144759 | 0.2611 |

| Statistic | Value | Statistic | Value |
|---|---|---|---|
| R-squared | 0.292984 | Mean dependent var | 0.017372 |
| Adjusted R-squared | 0.247370 | S.D. dependent var | 0.028968 |
| S.E. of regression | 0.025131 | Akaike info criterion | -4.445344 |
| Sum squared resid | 0.019578 | Schwarz criterion | -4.310665 |
| Log likelihood | 78.57084 | Hannan-Quinn criter. | -4.399414 |
| F-statistic | 6.423121 | Durbin-Watson stat | 2.210357 |
| Prob(F-statistic) | 0.004636 | | |
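The first two lines of the White test output are both functions of the auxiliary regression's R-squared. As a rough sketch, using only numbers from the table, the LM form is n times R-squared and the F form is the overall F-statistic of the auxiliary regression. The chi-square p-value uses the closed form exp(−x/2), which holds for 2 degrees of freedom only; all variable names are illustrative.

```python
import math

n_obs = 34         # observations in the auxiliary regression
aux_r2 = 0.292984  # R-squared of the auxiliary regression
n_aux_slopes = 2   # X and X^2

# LM form: Obs*R-squared, chi-square with 2 degrees of freedom under H0.
lm_stat = n_obs * aux_r2
lm_p_value = math.exp(-lm_stat / 2.0)

# F form: overall F-statistic of the auxiliary regression, F(2, 31) under H0.
f_stat = (aux_r2 / n_aux_slopes) / ((1.0 - aux_r2) / (n_obs - n_aux_slopes - 1))

print(f"Obs*R-squared: {lm_stat:.4f} (p-value {lm_p_value:.4f})")
print(f"F-statistic:   {f_stat:.4f}")
print("reject homoskedasticity at 5%:", lm_p_value < 0.05)
```

The LM statistic also exceeds 5.991, the 5% critical value of the chi-square distribution with 2 degrees of freedom, which leads to the same decision.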
White heteroskedasticity-consistent standard errors
Dependent Variable: Y
Method: Least Squares
Date: 06/08/18 Time: 15:48
Sample: 1 34
Included observations: 34
White heteroskedasticity-consistent standard errors & covariance

| Variable | Coefficient | Std. Error | t-Statistic | Prob. |
|---|---|---|---|---|
| C | -0.124573 | 0.040414 | -3.082420 | 0.0042 |
| X | 0.073173 | 0.006212 | 11.78005 | 0.0000 |

| Statistic | Value | Statistic | Value |
|---|---|---|---|
| R-squared | 0.861823 | Mean dependent var | 0.476735 |
| Adjusted R-squared | 0.857505 | S.D. dependent var | 0.359903 |
| S.E. of regression | 0.135858 | Akaike info criterion | -1.097394 |
| Sum squared resid | 0.590635 | Schwarz criterion | -1.007608 |
| Log likelihood | 20.65569 | Hannan-Quinn criter. | -1.066774 |
| F-statistic | 199.5875 | Durbin-Watson stat | 1.774258 |
| Prob(F-statistic) | 0.000000 | Wald F-statistic | 138.7696 |
| Prob(Wald F-statistic) | 0.000000 | | |
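The White heteroskedasticity-consistent estimation gives the same coefficient estimates as the OLS output in part (c); only the standard errors change. The standard error of X rises from 0.0052 to 0.0062 and its t-statistic falls from 14.13 to 11.78, while the standard error of the intercept falls from 0.0485 to 0.0404. The slope coefficient remains highly significant.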
Confidence interval
Coefficient Confidence Intervals
Date: 06/11/18 Time: 15:27
Sample: 1 34
Included observations: 34

| Variable | Coefficient | 90% CI Low | 90% CI High |
|---|---|---|---|
| C | -0.124573 | -0.193030 | -0.056116 |
| X | 0.073173 | 0.062651 | 0.083695 |
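The interval bounds can be reproduced as coefficient ± critical value × standard error. The sketch below assumes the White heteroskedasticity-consistent standard errors, since the reported bounds are consistent with those rather than with the OLS standard errors. The critical value is an approximate table value for 32 degrees of freedom entered by hand, and the helper name `conf_interval` is hypothetical.

```python
def conf_interval(coef: float, std_err: float, t_crit: float) -> tuple[float, float]:
    """Two-sided confidence interval: coef -/+ t_crit * std_err."""
    half_width = t_crit * std_err
    return coef - half_width, coef + half_width


# Approximate 95th percentile of t with 34 - 2 = 32 degrees of freedom,
# taken from standard tables (an assumption, not computed here).
T_CRIT_90 = 1.6939

# Coefficients and White standard errors copied from the robust output.
low_c, high_c = conf_interval(-0.124573, 0.040414, T_CRIT_90)
low_x, high_x = conf_interval(0.073173, 0.006212, T_CRIT_90)

print(f"C: [{low_c:.6f}, {high_c:.6f}]")
print(f"X: [{low_x:.6f}, {high_x:.6f}]")
```

The 90% interval for the slope excludes zero, in line with the significance of X in the regression output.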