## Descriptive Statistics for Quantitative Variables

i.Present appropriate descriptive statistics for the quantitative variables that you have downloaded. Remember to only provide appropriate statistics (mean, median, standard deviation, variance, min, max). Briefly describe in words what these descriptive statistics show, paying attention to the measures of central location (and thus symmetry/skewness) and the range.

ii.Present appropriate graphs of the relationship between house prices (our outcome of interest) and the four quantitative variables in the sample that theory and common sense would suggest will affect prices: lot size, number of bedrooms, number of bathrooms, and number of garages. Provide a brief description of what your graphs show. Do the relationships between the variables look linear or not?

iii.Estimate the relationship between house prices and the four quantitative variables that you considered in part (ii) using OLS. Write down the equation for the population model you are estimating. Report the estimates for your model in a table (Excel output table is fine) and write down the estimated model. Provide a full interpretation in words for each of the slope coefficients that you have estimated, paying particular attention to whether the relationships are significant or not. Finally, comment on the fit of the model you have estimated while briefly explaining and interpreting the goodness of fit measures.

iv. Suppose a house owner in the area is thinking of building an additional bathroom before selling his house to achieve a higher price. However, since building a bathroom is quite costly, it would only be worthwhile if it would increase the house price by over 16,000 CAD. Undertake a t-test to check whether the owner should build the additional bathroom or not. Do the six steps of the test and use the Excel output to help you construct the test statistic. (You can assume that all the required conditions hold.)

v. The model in part (iii) assumes that the price effect of an additional bathroom does not depend on the size of the house. While we don't have information on the size of the house, we have information on the size of the lot, which we can use as a proxy variable for house size. Create an interaction variable (bath_size = bath*lot_siz in Excel) and then re-estimate the model from part (iii) including the new interaction variable (i.e. in addition to the quantitative variables and intercept which were present in the model in part (iii)). Report the Excel output table of estimation results.

## Graphs depicting the relationship between house prices and quantitative variables

Write down the estimated model and interpret the coefficient on the interaction term. Find the expression for the effect of an additional bathroom on house price and then compute this effect for a house of average lot size. Interpret this result.

vi.List the five required conditions for using Ordinary Least Squares. Which of these required conditions can you check for your particular model estimated in part (v)? For those required conditions that you cannot check, describe why they cannot be checked. For the required conditions you can check, provide evidence of whether the required conditions hold or not. For any required conditions that you do not believe to hold here, briefly describe the specific consequences for your OLS results.

vii.Do you think we have multicollinearity in the model? Check the signs for multicollinearity to explain your answer.

viii.The data set contains an additional indicator variable that describes whether the house has a basement. We would think that this is a factor that may affect the price of a house and should therefore be included in the model.

Re-estimate the relationship for house prices from part (v) adding the indicator variable on basement as a regressor (i.e. in addition to your quantitative variables, interaction variable and intercept in part (v)). Report the estimates for your model in a table (Excel output table is fine) and write down the estimated model.

Provide a full interpretation of the coefficient for the newly included nominal variable. Comment on the fit of the model you have just estimated and compare it to the fit of the model from part (v). Using the Excel output, what can you say about the validity of the model?

ix.Using your estimates from part (viii), construct a point prediction for the price of a house with the following fairly typical characteristics: lot size of 5000 square feet, 3 bedrooms, 1 bathroom and 1 garage. Assume the house has no basement. Then construct a point prediction for the price of a house with the same characteristics as before, but with a basement now. Compare the two predictions and briefly discuss what you find.

x.Think of a question from finance, marketing, management, economics or accounting that can be answered with a multiple regression analysis. Write down the question, what data you would need to collect (what variables) and the population model that you would estimate. Briefly explain how the OLS estimates would help you answer your question.

## Descriptive Statistics for Quantitative Variables

Here are the descriptive statistics for sales price and lot size.

| Statistic | price | lot_siz |
|---|---|---|
| Mean | 67.7689 | 5104.691 |
| Standard Error | 1.12846 | 93.41192 |
| Median | 62 | 4505 |
| Mode | 60 | 6000 |
| Standard Deviation | 26.07696 | 2158.604 |
| Sample Variance | 680.0079 | 4659570 |
| Kurtosis | 1.564798 | 3.039063 |
| Skewness | 1.107003 | 1.389213 |
| Range | 165 | 14550 |
| Minimum | 25 | 1650 |
| Maximum | 190 | 16200 |
| Sum | 36188.59 | 2725905 |
| Count | 534 | 534 |

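Price: the average price is 67.7689, or about \$67,769, as the data are given in \$1000 units. The standard deviation is 26.08 (about \$26,077), which is sizeable relative to the mean; the value 1.128 in the table is the standard error of the mean, not the standard deviation. The data are positively skewed, but only moderately, with a skewness of 1.107. This is consistent with the mean (67.77) exceeding both the median (62) and the mode (60), which are close to each other. The minimum price is \$25,000 and the maximum is \$190,000, giving a range of \$165,000.

Lot size: the average lot size is 5104.69 square feet. This lies between the median of 4505 and the mode of 6000 square feet. The data are positively skewed with a skewness of 1.389, consistent with the mean exceeding the median. The range is very large at 14,550 square feet (from 1650 to 16,200).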
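The difference between the standard deviation and the standard error in the table can be checked directly. The following is a minimal sketch that assumes the usual definitions (standard error of the mean = standard deviation / sqrt(n), sample variance = standard deviation squared) and uses only the figures reported in the table; the variable names are illustrative.

```python
import math

# Figures copied from the descriptive statistics table.
n = 534
sd = {"price": 26.07696, "lot_siz": 2158.604}

# Standard error of the mean = standard deviation / sqrt(n)
se = {name: s / math.sqrt(n) for name, s in sd.items()}
# Sample variance = standard deviation squared
variance = {name: s ** 2 for name, s in sd.items()}

for name in sd:
    print(f"{name}: SD = {sd[name]}, SE = {se[name]:.5f}, variance = {variance[name]:.1f}")
```

The computed standard errors should agree with the "Standard Error" row of the table up to rounding, which confirms that 1.128 describes the precision of the mean price and not the spread of prices.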

We show plots for the association between price and each of the four quantitative variables.

Price and lot size: the first scatterplot shows a positive association between the variables. This is shown by the upward sloping trend line. The degree of association is low, as the R^2 value is only 0.293: lot size explains only about 29% of the variation in prices. Other factors are needed to explain prices.

Price and no of bedrooms: the table below lists houses with 1 to 4 bedrooms, and 3-bedroom houses are the most common. The highest price is for a 3 bedroom house.

| No of bedrooms | No of houses |
|---|---|
| 1 | 2 |
| 2 | 135 |
| 3 | 294 |
| 4 | 91 |

Price and no of bathrooms: houses with 3 bathrooms are the least common and houses with 1 bathroom are the most common in the sample, as the table below shows. The scatterplot likewise has the most points at 1 bathroom.

| No of bathrooms | No of houses |
|---|---|
| 1 | 394 |
| 2 | 130 |
| 3 | 10 |
| 4 | 0 |

Price and no of garages: 300 out of the 534 houses in the sample have zero garages.

None of the relationships looks clearly linear, as the data are discrete in 3 variables (bedrooms, bathrooms and garages), each of which takes only a few values. For lot size the linear relationship is weak, as the R^2 value of 0.293 tells us.

The regression results are below:

| Regression Statistics | |
|---|---|
| Multiple R | 0.715192 |
| R Square | 0.5115 |
| Adjusted R Square | 0.507806 |
| Standard Error | 18.29469 |
| Observations | 534 |

| ANOVA | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 4 | 185390.2 | 46347.55 | 138.4767 | 6.9E-81 |
| Residual | 529 | 177054 | 334.6957 | | |
| Total | 533 | 362444.2 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
|---|---|---|---|---|---|---|---|---|
| Intercept | 0.235646 | 3.659526 | 0.064392 | 0.948682 | -6.95334 | 7.424633 | -6.95334 | 7.424633 |
| lot_siz | 0.004787 | 0.000393 | 12.17072 | 3.23E-30 | 0.004014 | 0.00556 | 0.004014 | 0.00556 |
| bed | 5.54546 | 1.161663 | 4.773724 | 2.34E-06 | 3.263421 | 7.8275 | 3.263421 | 7.8275 |
| bath | 17.80661 | 1.770592 | 10.05687 | 6.85E-22 | 14.32836 | 21.28487 | 14.32836 | 21.28487 |
| Gar | 6.058494 | 1.061021 | 5.71006 | 1.88E-08 | 3.974162 | 8.142826 | 3.974162 | 8.142826 |

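The overall fit is moderate: R^2 is 0.5115, so the four variables together explain about 51% of the variation in price. (The value 0.715 is Multiple R, the correlation between actual and fitted prices, not R^2.) The adjusted R^2 of 0.5078 is very close to R^2, and the standard error of the regression is 18.29, or about \$18,295.

The high F value of 138.48, with a Significance F of 6.9E-81, implies that the overall model is significant.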
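The goodness of fit measures can be recomputed from the ANOVA part of the Excel output. The sketch below assumes the standard formulas R^2 = SSR/SST and F = (SSR/k) / (SSE/(n-k-1)) and uses only the sums of squares reported for the four-variable model; the variable names are illustrative.

```python
# Sums of squares from the ANOVA table of the four-variable model.
ssr = 185390.2   # regression sum of squares
sse = 177054.0   # residual sum of squares
sst = 362444.2   # total sum of squares
n, k = 534, 4    # observations and number of slope coefficients

r_squared = ssr / sst                     # share of variation explained
multiple_r = r_squared ** 0.5             # "Multiple R" in the Excel output
f_stat = (ssr / k) / (sse / (n - k - 1))  # mean square regression / mean square residual

print(f"R^2 = {r_squared:.4f}, Multiple R = {multiple_r:.4f}, F = {f_stat:.2f}")
```

This makes the distinction explicit: Multiple R is the square root of R^2, so it is always the larger of the two and should not be read as the share of variation explained.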

The coefficient of lot size is 0.004787, which means that for every 1 square foot increase in lot size the price rises by 0.004787*1000 = \$4.79, holding the other variables constant. The coefficient is significant as the p value is almost zero.

The coefficient of bedrooms is 5.5455, which means that for every additional bedroom the price rises by 5.5455*1000 = \$5545.5, holding the other variables constant. The coefficient is significant as the p value is almost zero.

The coefficient of bathrooms is 17.8066, which means that for every additional bathroom the price rises by 17.8066*1000 = \$17806.6, holding the other variables constant. The coefficient is significant as the p value is almost zero.


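The coefficient of garage is 6.0585, which means that for every additional garage the price rises by 6.0585*1000 = \$6058.5, holding the other variables constant. The coefficient is significant as the p value is almost zero.

The marginal effect of a bathroom is the largest per unit among all variables, although the units differ (one bathroom versus one square foot of lot size).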
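Because price is recorded in \$1000 units, each slope has to be multiplied by 1000 to read it in dollars. The following is a small sketch of that conversion using the coefficients from the Excel output of the four-variable model; the dictionary names are illustrative.

```python
# Slope coefficients from the Excel output (price in $1000 units).
coefficients = {
    "lot_siz": 0.004787,   # per square foot
    "bed": 5.54546,        # per bedroom
    "bath": 17.80661,      # per bathroom
    "gar": 6.058494,       # per garage
}

# Convert each marginal effect to dollars per one-unit increase.
dollars = {name: round(b * 1000, 2) for name, b in coefficients.items()}

for name, value in dollars.items():
    print(f"One more unit of {name} changes the predicted price by ${value:,.2f}")
```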

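The t test for the bathroom variable is as follows. Price is measured in \$1000 units, so the threshold of 16,000 corresponds to a coefficient value of 16.

Step 1:

Ho: coefficient for bathroom ≤ 16

H1: coefficient for bathroom > 16

Step 2:

Set significance level = 0.05 (one-tailed test)

Step 3:

The test statistic is t = (17.80661 - 16)/1.770592 = 1.020, using the coefficient and standard error from the Excel output. The t value of 10.05 in the output tests the coefficient against 0, not against 16, so it cannot be used here.

Step 4:

Compare the t value with the critical value. With 534 - 4 - 1 = 529 degrees of freedom, the one-tailed critical value at 0.05 is approximately 1.648. t value = 1.020 < 1.648

Step 5:

As the t value is lower than the critical value, we do not reject Ho.

Step 6: there is not enough evidence at the 5% level that an additional bathroom raises the price by more than 16,000.

The coefficient value is 17.8066, which means that every additional bathroom is estimated to add 17.8066*1000 = 17806 to the expected price of the house. Although this point estimate is higher than the threshold of 16000, the difference is not statistically significant given the standard error of 1.7706, so on this evidence the owner should not build the additional bathroom.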
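The question asks whether an extra bathroom adds more than 16 (in \$1000 units), which can be checked as a one-sided test of the bathroom slope against 16. The sketch below computes the test statistic from the Excel coefficient and standard error. As an assumption made to stay within the Python standard library, it uses the standard normal distribution to approximate the t distribution with 529 degrees of freedom, so the critical value and p value are approximate.

```python
from statistics import NormalDist

# Bathroom coefficient and standard error from the Excel output.
b_bath = 17.80661
se_bath = 1.770592
threshold = 16.0    # 16,000 expressed in $1000 units
alpha = 0.05

# Test statistic for Ho: coefficient <= 16 against H1: coefficient > 16.
t_stat = (b_bath - threshold) / se_bath

# Normal approximation to the t distribution (reasonable for 529 df).
critical_value = NormalDist().inv_cdf(1 - alpha)
p_value_approx = 1 - NormalDist().cdf(t_stat)

reject_h0 = t_stat > critical_value
print(f"t = {t_stat:.3f}, critical value ~ {critical_value:.3f}, p ~ {p_value_approx:.3f}")
print("Reject Ho" if reject_h0 else "Do not reject Ho")
```

With an exact t table the critical value is slightly larger than the normal one, so the decision is the same under either choice.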

The regression results with the interaction variable are:

| Regression Statistics | |
|---|---|
| Multiple R | 0.718119 |
| R Square | 0.515694 |
| Adjusted R Square | 0.511108 |
| Standard Error | 18.23322 |
| Observations | 534 |

| ANOVA | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 5 | 186910.4 | 37382.08 | 112.4441 | 9E-81 |
| Residual | 528 | 175533.8 | 332.4504 | | |
| Total | 533 | 362444.2 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
|---|---|---|---|---|---|---|---|---|
| Intercept | 10.49715 | 6.027468 | 1.741552 | 0.082169 | -1.34361 | 22.33791 | -1.34361 | 22.33791 |
| lot_siz | 0.002977 | 0.000933 | 3.191357 | 0.0015 | 0.001144 | 0.00481 | 0.001144 | 0.00481 |
| bed | 5.381932 | 1.160283 | 4.638464 | 4.43E-06 | 3.102594 | 7.661271 | 3.102594 | 7.661271 |
| bath | 10.15036 | 3.991663 | 2.54289 | 0.011278 | 2.30887 | 17.99185 | 2.30887 | 17.99185 |
| gar | 6.251165 | 1.061288 | 5.890168 | 6.87E-09 | 4.1663 | 8.336031 | 4.1663 | 8.336031 |
| BATH*LOTSIZ | 0.001358 | 0.000635 | 2.138367 | 0.032945 | 0.00011 | 0.002606 | 0.00011 | 0.002606 |

The estimated model for price (P) is

P = 10.50 + 0.002977*lot size + 5.38*bedroom + 10.15*bathroom + 6.25*garage + 0.001358*(bath*lot size)

The effect of an additional bathroom on price is 10.15 + 0.001358*lot size. For a house with the average lot size of 5104.69 square feet, the price will change by

10.15 + 0.001358*5104.69 = 17.083 or 17.083*1000 = \$17083. Price rises by about \$17083 with an additional bathroom for a house of average lot size, holding the other variables constant.

The coefficient of the interaction term is 0.001358. This is significant at the 5% level, as the p value of 0.033 is less than 0.05. It is therefore also significant at the 10% level, but not at the 1% level. As the coefficient is positive, it tells us that the effects of lot size and bathrooms are not independent of each other: the price effect of an additional bathroom is 10.15 + 0.001358*lot size, so it rises by 0.001358*1000 = \$1.36 for every additional square foot of lot size. In the same way, the effect of one more square foot of lot size is 0.002977 + 0.001358*bath, which for a house with 1 bathroom is 0.004335 or \$4.34.
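With an interaction term, the effect of a bathroom is no longer a single number but a function of lot size. The following is a minimal sketch of that calculation using the coefficients from the Excel output of the interaction model; the function names are illustrative and not part of the assignment.

```python
# Coefficients from the Excel output of the model with the interaction term.
B_LOT = 0.002977
B_BATH = 10.15036
B_INTERACTION = 0.001358   # coefficient on bath * lot_siz

def bathroom_effect(lot_size):
    """Change in price (in $1000) from one more bathroom at a given lot size."""
    return B_BATH + B_INTERACTION * lot_size

def lot_size_effect(bathrooms):
    """Change in price (in $1000) from one more square foot at a given bathroom count."""
    return B_LOT + B_INTERACTION * bathrooms

mean_lot_size = 5104.691   # sample mean from the descriptive statistics
effect_at_mean = bathroom_effect(mean_lot_size)

print(f"Bathroom effect at average lot size: {effect_at_mean:.3f} (x $1000)")
print(f"Lot size effect with 1 bathroom: {lot_size_effect(1):.6f} (x $1000)")
```

Evaluating `bathroom_effect` at other lot sizes shows how the value of an extra bathroom grows with the size of the lot.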

The assumptions/conditions are:

Model: Y = a + b1*X1 + b2*X2 + b3*X3 + ... + bn*Xn + error

• The model is linear in the relationship between explanatory and dependent variables.
• The error terms are normally distributed.
• There is no correlation between the error terms.
• The variance of errors is equal for all observations (1 to n).

The correlation matrix is given below. The correlations between pairs of explanatory variables do not exceed 0.5 (the largest is 0.374, between bedrooms and bathrooms), which suggests a lack of multicollinearity. A value of 0.54 exists between the dependent variable (price) and lot size, but this is not relevant to multicollinearity, which concerns only the correlations among the explanatory variables.
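The pairwise check described here can be sketched as follows, using the correlations between explanatory variables reported in the matrix for this section. The 0.5 cutoff is the informal rule of thumb used in the text, not a formal test, and the matrix does not include the interaction variable, so this check does not cover it.

```python
# Pairwise correlations between explanatory variables, taken from the correlation matrix.
correlations = {
    ("lot_siz", "bed"): 0.146674,
    ("lot_siz", "bath"): 0.184399,
    ("bed", "bath"): 0.37413,
    ("lot_siz", "gar"): 0.328838,
    ("bed", "gar"): 0.127626,
    ("bath", "gar"): 0.172597,
}

CUTOFF = 0.5   # informal threshold used in the text

# Find the most strongly correlated pair and any pairs above the cutoff.
strongest_pair, strongest_value = max(correlations.items(), key=lambda item: abs(item[1]))
flagged = [pair for pair, r in correlations.items() if abs(r) > CUTOFF]

print(f"Strongest pair: {strongest_pair} with r = {strongest_value}")
print(f"Pairs above {CUTOFF}: {flagged if flagged else 'none'}")
```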

| | price | lot_siz | bed | bath | gar |
|---|---|---|---|---|---|
| price | 1 | | | | |
| lot_siz | 0.541918 | 1 | | | |
| bed | 0.364055 | 0.146674 | 1 | | |
| bath | 0.498301 | 0.184399 | 0.37413 | 1 | |
| gar | 0.393369 | 0.328838 | 0.127626 | 0.172597 | 1 |

SUMMARY OUTPUT

| Regression Statistics | |
|---|---|
| Multiple R | 0.728089 |
| R Square | 0.530113 |
| Adjusted R Square | 0.524763 |
| Standard Error | 17.97678 |
| Observations | 534 |

| ANOVA | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 6 | 192136.4 | 32022.74 | 99.09109 | 3.69E-83 |
| Residual | 527 | 170307.8 | 323.1647 | | |
| Total | 533 | 362444.2 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
|---|---|---|---|---|---|---|---|---|
| Intercept | 12.24988 | 5.958656 | 2.055812 | 0.040293 | 0.544242 | 23.95551 | 0.544242 | 23.95551 |
| lot_siz | 0.002521 | 0.000927 | 2.720191 | 0.00674 | 0.0007 | 0.004341 | 0.0007 | 0.004341 |
| bed | 5.048108 | 1.146972 | 4.401246 | 1.3E-05 | 2.794908 | 7.301307 | 2.794908 | 7.301307 |
| bath | 7.871628 | 3.976107 | 1.979732 | 0.048253 | 0.060663 | 15.68259 | 0.060663 | 15.68259 |
| gar | 6.154036 | 1.04664 | 5.8798 | 7.3E-09 | 4.097937 | 8.210135 | 4.097937 | 8.210135 |
| BATH*LOTSIZ | 0.001687 | 0.000632 | 2.670411 | 0.00781 | 0.000446 | 0.002927 | 0.000446 | 0.002927 |
| b_ment | 6.667565 | 1.658029 | 4.02138 | 6.63E-05 | 3.410407 | 9.924722 | 3.410407 | 9.924722 |

P = 12.25 + 0.002521*lot size + 5.048*bedroom + 7.872*bathroom + 6.154*garage + 6.668*basement + 0.001687*(bath*lot size)

The coefficient of basement is positive, which means that having a basement adds to price. Holding the other variables constant, a house with a basement is priced on average 6.668*1000 = \$6668 higher than a house without one.

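To compare the model with and without the basement variable we must look at the significance of the added variable as well as the effect on the adjusted R^2 value. Note that the p value of the coefficient of basement is almost zero (6.63E-05), which makes it significant. The F value of 99.09, with a Significance F of 3.69E-83, shows that the model as a whole is valid.

A simple comparison of R^2 values shows an improvement from 0.516 to 0.530, a marginal increase. (The values 0.72 and 0.73 in the output are Multiple R, not R^2.) The value of adjusted R^2 is also marginally better with the expanded model (0.525 against 0.511 in the smaller model), and the standard error of the regression falls from 18.23 to 17.98.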
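Adjusted R^2 is the better basis for comparing the two models because it penalises the extra regressor. The sketch below assumes the standard formula, adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - k - 1), and applies it to the R Square values from the two Excel outputs; the function name is illustrative.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for a model with n observations and k slope coefficients."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

n = 534
# R Square values from the Excel outputs.
adj_without_basement = adjusted_r2(0.515694, n, k=5)   # model with interaction term
adj_with_basement = adjusted_r2(0.530113, n, k=6)      # model adding the basement indicator

print(f"Adjusted R^2 without basement: {adj_without_basement:.6f}")
print(f"Adjusted R^2 with basement:    {adj_with_basement:.6f}")
print(f"Improvement: {adj_with_basement - adj_without_basement:.6f}")
```

Both results should agree with the "Adjusted R Square" rows of the two outputs up to rounding.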

A point prediction uses the estimated coefficients along with the given values of the explanatory variables:

Predicted value of price = 12.25 + 0.002521*lot size + 5.048*bedroom + 7.872*bathroom + 6.154*garage + 6.668*basement + 0.001687*(bath*lot size)

For a lot size of 5000 square feet, 3 bedrooms, 1 bathroom, 1 garage and no basement (note that the interaction term uses the number of bathrooms, 1, and not the number of bedrooms):

= 12.25 + 0.002521*5000 + 5.048*3 + 7.872*1 + 6.154*1 + 6.668*0 + 0.001687*(1*5000) = 62.460 or about \$62460.

Now we include a basement, which changes only the basement term:

= 12.25 + 0.002521*5000 + 5.048*3 + 7.872*1 + 6.154*1 + 6.668*1 + 0.001687*(1*5000) = 69.128

or about \$69128.

The additional price is equal to the coefficient of basement multiplied by 1000 (about \$6668), as the units of price are in 1000 dollars and all other characteristics are the same in the two predictions.
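The two point predictions can be reproduced with a small function. This is a sketch that uses the unrounded coefficients from the Excel output of the model with the basement indicator, so results may differ slightly from hand calculations with rounded coefficients. It builds the interaction term as bathrooms times lot size (1 * 5000 for a one-bathroom house). The function and dictionary names are illustrative.

```python
# Coefficients from the Excel output of the model with the basement indicator.
COEF = {
    "intercept": 12.24988,
    "lot_siz": 0.002521,
    "bed": 5.048108,
    "bath": 7.871628,
    "gar": 6.154036,
    "bath_size": 0.001687,   # interaction: bath * lot_siz
    "b_ment": 6.667565,      # 1 if the house has a basement, else 0
}

def predict_price(lot_siz, bed, bath, gar, b_ment):
    """Predicted price in $1000 units."""
    return (
        COEF["intercept"]
        + COEF["lot_siz"] * lot_siz
        + COEF["bed"] * bed
        + COEF["bath"] * bath
        + COEF["gar"] * gar
        + COEF["bath_size"] * bath * lot_siz
        + COEF["b_ment"] * b_ment
    )

no_basement = predict_price(lot_siz=5000, bed=3, bath=1, gar=1, b_ment=0)
with_basement = predict_price(lot_siz=5000, bed=3, bath=1, gar=1, b_ment=1)

print(f"Without basement: {no_basement:.3f} (x $1000)")
print(f"With basement:    {with_basement:.3f} (x $1000)")
print(f"Difference:       {with_basement - no_basement:.3f} (x $1000)")
```

The difference between the two predictions equals the basement coefficient, because every other input is held fixed.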

From a marketing perspective it may be sensible to ask which feature in a home fetches the highest price. These features include a bathroom, garage, bedroom or basement. The data needed are the sale price and the number (or presence) of each feature for a sample of houses, and the population model would regress price on these variables. In the estimated model, the feature with the highest coefficient is the one that adds most to price, holding the others constant. The marketing team can then sell this feature more aggressively, as buyers are more willing to pay for it based on the data given to us.

