Assignment on Linear Regression and STATA

Before beginning this assignment, please be sure to read the document entitled “General Assignment Guidelines”. All references to page numbers, equations, and tables are to the 5th edition of Dougherty. All tests should be conducted at the 5% significance level, unless specified otherwise.

1. a) Suppose that the true linear regression model in a given situation is Now, assume that the researcher mistakenly believes that the true model is and that he estimates this model, accordingly. Prove that his (OLS) estimators of and will both be unbiased.

b) Suppose that the true linear regression model in a given situation is

Now, assume that the researcher mistakenly believes that the true model is, and that he estimates this model, accordingly. Prove that his (OLS) estimator of will be biased.

2. Consider the following hedonic price function model for 1987 house prices in Windsor, Ontario:

where: = price (dollars), = lot size in square feet, = number of bedrooms, = number of bathrooms, = 1 if full basement present, = 0 otherwise.

a) Using the data in the Excel file housedata.xlsx, together with the appropriate STATA commands, estimate this model. Cut and paste your output into your assignment.
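NOTE: As an illustration only, one way to load an Excel file and run a regression in STATA is sketched below. The variable names (price, lotsize, bedrooms, bathrooms, basement) are hypothetical, and the code assumes that the first row of the spreadsheet holds the column names. The log transformation is a placeholder: which variables enter in logs must follow the model as stated in the question.

```stata
* Illustrative sketch: all variable names below are hypothetical.
* Check the actual column names with -describe- after importing.
import excel using "housedata.xlsx", firstrow clear
describe

* Create any transformed variables that the model calls for.
* The log of price here is only an example of the syntax.
generate lnprice = ln(price)

* Dependent variable first, then the regressors
regress lnprice lotsize bedrooms bathrooms basement
```

The “reg” output reports each coefficient with its standard error, t ratio, and p-value, along with the residual sum of squares in the “Residual” row of the ANOVA table.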

b) Carefully interpret the estimated slope parameters. HINT: Think about the various different functional forms discussed in Section 4.2. Also, be sure to read Box 5.1. NOTE: There is a typo in Box 5.1. The sentence beginning “If is small, is approximately equal to , implying …” should read “If is small, is approximately equal to , implying …”
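NOTE: The following is a sketch of the standard result for a dummy variable in a semilogarithmic model, under the assumption that the dependent variable enters in logs. The symbols Y, X, D, and \delta are chosen here for illustration and need not match the notation in Box 5.1.

```latex
\begin{align*}
\log Y &= \beta_1 + \beta_2 X + \delta D + u \\
\frac{Y \mid D = 1}{Y \mid D = 0} &= e^{\delta}
  \quad \text{(holding } X \text{ and } u \text{ fixed)} \\
\text{percentage difference} &= 100\,(e^{\delta} - 1)
  \;\approx\; 100\,\delta \quad \text{when } \delta \text{ is small}
\end{align*}
```

For example, \delta = 0.05 gives an exact difference of about 5.13%, close to the approximation of 5%, whereas for larger values of \delta the two diverge noticeably.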

c) Do the signs of the estimated slope parameters accord with your prior expectations? Explain.

d) Are the estimated slope parameters individually statistically significant? Explain.
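NOTE: As an illustration only, the sketch below computes a t ratio and the matching 5% two-sided critical value from stored regression results. It uses STATA's built-in auto dataset as a stand-in, not the housing data, and the scalar names are arbitrary.

```stata
* Stand-in data: STATA's built-in auto dataset, not the housing data
sysuse auto, clear
regress price mpg weight

* t ratio for one coefficient and the 5% two-sided critical value
scalar t_mpg  = _b[mpg] / _se[mpg]
scalar t_crit = invttail(e(df_r), 0.025)

display "t ratio for mpg             = " t_mpg
display "5% two-sided critical value = " t_crit
display "two-sided p-value           = " 2 * ttail(e(df_r), abs(t_mpg))
```

The null hypothesis of a zero coefficient is rejected at the 5% level when the absolute t ratio exceeds the critical value, which is equivalent to a p-value below 0.05.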

e) Use the Chow test, as described on pp. 255-258, to test the “parameter stability” of this cross-sectional model. Divide the sample into two halves, so that the first half consists of the first 273 observations and the second half consists of the remaining 273 observations. Explain.

NOTE: One way to estimate a model in STATA using only the first 273 observations would be to use the following STATA code:

gen z = _n
reg y x2 x3 x4 if z <= 273

This code creates a new variable, z, which is equal to the observation number, and then estimates a (general) model using only the first 273 observations in the sample. You can adapt this code, as appropriate, for the present problem.
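NOTE: As an illustration only, the Chow F statistic can be assembled by hand from the residual sums of squares that “reg” stores in e(rss). The sketch below uses STATA's built-in auto dataset (74 observations, split 37/37) as a stand-in, so the variable names and the split point are assumptions that must be adapted to the present problem.

```stata
* Stand-in data: STATA's built-in auto dataset, not the housing data
sysuse auto, clear
generate z = _n

* Pooled regression over the full sample
regress price mpg weight
scalar rss_p = e(rss)
scalar k_par = e(df_m) + 1        // number of parameters, incl. intercept
scalar n_obs = e(N)

* Separate regressions for the two subsamples
regress price mpg weight if z <= 37
scalar rss_a = e(rss)
regress price mpg weight if z > 37
scalar rss_b = e(rss)

* Chow F statistic with (k, n - 2k) degrees of freedom
scalar chow_F = ((rss_p - rss_a - rss_b) / k_par) / ///
                ((rss_a + rss_b) / (n_obs - 2 * k_par))

display "Chow F            = " chow_F
display "5% critical value = " invFtail(k_par, n_obs - 2 * k_par, 0.05)
display "p-value           = " Ftail(k_par, n_obs - 2 * k_par, chow_F)
```

The numerator measures how much the fit improves when each subsample is allowed its own parameters; the denominator is the unexplained variation remaining after the split.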

NOTE: You may NOT use the STATA “test” command for part f).

f) Redo this Chow test, using the alternative dummy variable framework, described on pp. 252-253. Do you get the same result? Explain.
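NOTE: As an illustration only, the dummy variable version of the test can be run without the “test” command by comparing the residual sums of squares of a restricted and an unrestricted regression. The sketch below again uses STATA's built-in auto dataset as a stand-in; the names half2, half2_mpg, and half2_weight are hypothetical.

```stata
* Stand-in data: STATA's built-in auto dataset, not the housing data
sysuse auto, clear
generate z = _n

* Dummy for the second subsample, and its interaction with each regressor
generate half2        = (z > 37)
generate half2_mpg    = half2 * mpg
generate half2_weight = half2 * weight

* Restricted model: same parameters in both subsamples
regress price mpg weight
scalar rss_r = e(rss)

* Unrestricted model: intercept and slopes may differ across subsamples
regress price mpg weight half2 half2_mpg half2_weight
scalar rss_u  = e(rss)
scalar df_u   = e(df_r)
scalar q_rest = 3                 // one restriction per added term

scalar dum_F = ((rss_r - rss_u) / q_rest) / (rss_u / df_u)
display "F                 = " dum_F
display "5% critical value = " invFtail(q_rest, df_u, 0.05)
display "p-value           = " Ftail(q_rest, df_u, dum_F)
```

Because the unrestricted regression fits each subsample separately in all but name, its residual sum of squares equals the sum of the two subsample residual sums of squares.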

g) Considering all of your results above, together with any additional insights that you may be able to glean from the “reg” output in part a), assess the overall quality of this model.

3. Consider the regression model discussed in Question 5.21.

a) Using the data from the EAWE12.dta dataset, together with the appropriate STATA commands, estimate this model. Cut and paste your output into your assignment.

b) Carefully interpret the estimated slope parameters.

c) Do the signs of the estimated slope parameters accord with your prior expectations? Explain.

d) Are the estimated slope parameters individually statistically significant? Explain.

NOTE: You may NOT use the STATA “test” command for part e).

e) Test to see whether the two interaction terms belong in the model. (Use a joint test.)
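NOTE: A joint test of this kind can be based on the standard F statistic that compares a restricted regression (interaction terms excluded) with an unrestricted one (interaction terms included). The notation below is chosen for illustration: q is the number of restrictions (here assumed to be 2, one per interaction term), n is the number of observations, and k is the number of parameters in the unrestricted model, including the intercept.

```latex
F(q,\, n - k) \;=\;
\frac{(RSS_R - RSS_U)\,/\,q}{RSS_U\,/\,(n - k)}
```

Both residual sums of squares are reported in the “reg” output, so the statistic can be computed by hand and compared with the 5% critical value of the F distribution with (q, n - k) degrees of freedom.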

NOTE: You may NOT use the STATA “test” command or the STATA “ovtest” command for part f).

f) Use the version of the RESET test involving BOTH the squared and the cubed predictions of the dependent variable to assess the specification of the model. Explain.
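NOTE: As an illustration only, a RESET test with squared and cubed fitted values can be carried out without the “test” or “ovtest” commands, as sketched below. The code uses STATA's built-in auto dataset as a stand-in for EAWE12.dta, and the variable and scalar names are hypothetical.

```stata
* Stand-in data: STATA's built-in auto dataset, not EAWE12.dta
sysuse auto, clear

* Original model: keep its RSS and its fitted values
regress price mpg weight
scalar rss_r = e(rss)
predict double yhat, xb
generate double yhat2 = yhat^2
generate double yhat3 = yhat^3

* Augmented model: add the squared and cubed fitted values
regress price mpg weight yhat2 yhat3
scalar rss_u = e(rss)
scalar df_u  = e(df_r)

* F statistic for the joint significance of the 2 added terms
scalar reset_F = ((rss_r - rss_u) / 2) / (rss_u / df_u)
display "RESET F           = " reset_F
display "5% critical value = " invFtail(2, df_u, 0.05)
display "p-value           = " Ftail(2, df_u, reset_F)
```

A significant F statistic indicates that powers of the fitted values help explain the dependent variable, which points to a possible problem with the functional form of the original model.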

g) Considering all of your results above, together with any additional insights that you may be able to glean from the “reg” output in part a), assess the overall quality of the linear model.

4. Consider Question 5.24.

a) Provide an interpretation of the estimated parameters reported in Column (1).

b) As far as this is possible, determine the numerical values indicated by the letters in Column (2). NOTE: The standard error for the estimated parameter associated with MALE in Column (1) should be 5.58, rather than 4.99.