Descriptive Statistics for Quantitative Variables

i. Present appropriate descriptive statistics for the quantitative variables that you have downloaded. Remember to only provide appropriate statistics (mean, median, standard deviation, variance, min, max). Briefly describe in words what these descriptive statistics show, paying attention to the measures of central location (and thus symmetry/skewness) and the range.

ii. Present appropriate graphs of the relationship between house prices (our outcome of interest) and the four quantitative variables in the sample that theory and common sense would suggest will affect prices: lot size, number of bedrooms, number of bathrooms, and number of garages. Provide a brief description of what your graphs show. Do the relationships between the variables look linear or not?

iii. Estimate the relationship between house prices and the four quantitative variables that you considered in part (ii) using OLS. Write down the equation for the population model you are estimating. Report the estimates for your model in a table (Excel output table is fine) and write down the estimated model. Provide a full interpretation in words for each of the slope coefficients that you have estimated, paying particular attention to whether the relationships are significant or not. Finally, comment on the fit of the model you have estimated while briefly explaining and interpreting the goodness of fit measures.

iv. Suppose a house owner in the area is thinking of building an additional bathroom before selling his house to achieve a higher price. However, since building a bathroom is quite costly, it would only be worthwhile if it would increase the house price by over 16,000 CAD. Undertake a t-test to check whether the owner should build the additional bathroom or not. Do the six steps of the test and use the Excel output to help you construct the test statistic. (You can assume that all the required conditions hold.)

v. The model in part (iii) assumes that the price effect of an additional bathroom does not depend on the size of the house. While we don't have information on the size of the house, we have information on the size of the lot, which we can use as a proxy variable for house size. Create an interaction variable (bath_size = bath*lot_siz in Excel) and then re-estimate the model from part (iii) including the new interaction variable (i.e. in addition to the quantitative variables and intercept which were present in the model in part (iii)). Report the Excel output table of estimation results.

Graphs depicting the relationship between house prices and quantitative variables

Write down the estimated model and interpret the coefficient on the interaction term. Find the expression for the effect of an additional bathroom on house price and then compute this effect for a house of average lot size. Interpret this result.

vi. List the five required conditions for using Ordinary Least Squares. Which of these required conditions can you check for your particular model estimated in part (v)? For those required conditions that you cannot check, describe why they cannot be checked. For the required conditions you can check, provide evidence of whether the required conditions hold or not. For any required conditions that you do not believe to hold here, briefly describe the specific consequences for your OLS results.

vii. Do you think we have multicollinearity in the model? Check the signs for multicollinearity to explain your answer.

viii. The data set contains an additional indicator variable that describes whether the house has a basement. We would think that this is a factor that may affect the price of a house and should therefore be included in the model.

Re-estimate the relationship for house prices from part (v) adding the indicator variable on basement as a regressor (i.e. in addition to your quantitative variables, interaction variable and intercept in part (v)). Report the estimates for your model in a table (Excel output table is fine) and write down the estimated model.

Provide a full interpretation of the coefficient for the newly included nominal variable. Comment on the fit of the model you have just estimated and compare it to the fit of the model from part (v). Using the Excel output, what can you say about the validity of the model?

ix. Using your estimates from part (viii), construct a point prediction for the price of a house with the following fairly typical characteristics: lot size of 5000 square feet, 3 bedrooms, 1 bathroom and 1 garage. Assume the house has no basement. Then construct a point prediction for the price of a house with the same characteristics as before, but with a basement now. Compare the two predictions and briefly discuss what you find.

x. Think of a question from finance, marketing, management, economics or accounting that can be answered with a multiple regression analysis. Write down the question, what data you would need to collect (what variables) and the population model that you would estimate. Briefly explain how the OLS estimates would help you answer your question.

Descriptive Statistics for Quantitative Variables

Here are the descriptive statistics for sales price and lot size.

Statistic             price (in $1000)    lot_siz (sq ft)
Mean                  67.7689             5104.691
Standard Error        1.12846             93.41192
Median                62                  4505
Mode                  60                  6000
Standard Deviation    26.07696            2158.604
Sample Variance       680.0079            4659570
Kurtosis              1.564798            3.039063
Skewness              1.107003            1.389213
Range                 165                 14550
Minimum               25                  1650
Maximum               190                 16200
Sum                   36188.59            2725905
Count                 534                 534

Price: the average price is 67.7689, or $67,769, as the data are given in units of $1000. The standard deviation is 26.08 ($26,077), roughly 38% of the mean; the value of 1.128 is the standard error of the mean, not the standard deviation. The data are positively skewed, but only moderately, with a skewness of 1.107. The minimum price is $25,000 and the maximum is $190,000, giving a range of $165,000. There is little difference between the median (62) and the mode (60), and the mean exceeds both, which is consistent with the positive skew.

Lot size: the average lot size is 5104.69 square feet. This lies between the mode of 6000 and the median of 4505 square feet. The data are positively skewed, with a skewness of 1.39, and the mean exceeds the median. The range is very large at 14,550 square feet (from 1650 to 16,200).
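The standard error and the standard deviation in the table are easy to confuse, so here is a minimal Python sketch of how they relate. It assumes only the figures reported in the table; the names `standard_error`, `sd` and `se` are illustrative and are not part of the Excel output.

```python
import math

# Values copied from the descriptive statistics table.
n = 534
sd = {"price": 26.07696, "lot_siz": 2158.604}


def standard_error(std_dev: float, count: int) -> float:
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return std_dev / math.sqrt(count)


se = {name: standard_error(value, n) for name, value in sd.items()}

for name, value in se.items():
    print(f"{name}: standard deviation = {sd[name]}, standard error = {value:.5f}")
```

The standard deviation describes the spread of individual house prices, while the standard error describes the precision of the sample mean, which is why it is so much smaller.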

We show plots of the association between price and each of the four quantitative variables:

Price and lot size: the first scatterplot shows a positive association between the variables, as indicated by the upward-sloping trend line. The association is weak, as the R2 value is only 0.293: lot size explains only about 29% of the variation in prices, so other factors are needed to explain prices.

Price and number of bedrooms: the table below counts houses with 1, 2, 3 or 4 bedrooms; 3-bedroom houses are the most common. The highest price is for a 3-bedroom house.

no of bedrooms    no of houses
1                 2
2                 135
3                 294
4                 91

Price and number of bathrooms: houses with 1 bathroom are the most common in the sample and houses with 3 bathrooms the least common, as the table below shows; no house has 4 bathrooms. The scatterplot likewise has the most points at 1 bathroom.

no of bathrooms    no of houses
1                  394
2                  130
3                  10
4                  0

Price and number of garages: 300 of the 534 houses in the sample have no garage.

None of the relationships is clearly linear. Bedrooms, bathrooms and garages are discrete variables that take only a few values, so their scatterplots form bands of points rather than a line. For lot size the linear relationship is weak, as the R2 value shows.

The regression results are below:

Regression Statistics
Multiple R           0.715192
R Square             0.5115
Adjusted R Square    0.507806
Standard Error       18.29469
Observations         534

ANOVA
             df    SS          MS          F           Significance F
Regression   4     185390.2    46347.55    138.4767    6.9E-81
Residual     529   177054      334.6957
Total        533   362444.2

             Coefficients  Standard Error  t Stat     P-value    Lower 95%  Upper 95%  Lower 95.0%  Upper 95.0%
Intercept    0.235646      3.659526        0.064392   0.948682   -6.95334   7.424633   -6.95334     7.424633
lot_siz      0.004787      0.000393        12.17072   3.23E-30   0.004014   0.00556    0.004014     0.00556
bed          5.54546       1.161663        4.773724   2.34E-06   3.263421   7.8275     3.263421     7.8275
bath         17.80661      1.770592        10.05687   6.85E-22   14.32836   21.28487   14.32836     21.28487
gar          6.058494      1.061021        5.71006    1.88E-08   3.974162   8.142826   3.974162     8.142826

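The overall fit is moderate: R2 is 0.51, so the four variables together explain about 51% of the variation in house prices (the value of 0.715 is Multiple R, the correlation between actual and predicted prices). Adjusted R2, which penalises additional regressors, is almost the same at 0.508. The standard error of the regression is 18.29, i.e. a typical prediction error of about $18,295.

The high F value of 138.48, with a Significance F of essentially zero (6.9E-81), implies that the overall model is significant.

The coefficient of lot size is 0.004787, which means that for every additional square foot of lot size the price rises by 0.004787*1000 = $4.79, holding the other variables constant. The coefficient is significant as the p value is almost zero.

The coefficient of bedrooms is 5.5455, which means that for every additional bedroom the price rises by 5.5455*1000 = $5545.5, holding the other variables constant. The coefficient is significant as the p value is almost zero.

The coefficient of bathrooms is 17.8066, which means that for every additional bathroom the price rises by 17.8066*1000 = $17806.6, holding the other variables constant. The coefficient is significant as the p value is almost zero.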


The coefficient of garage is 6.0585, which means that for every additional garage the price rises by 6.0585*1000 = $6058.5, holding the other variables constant. The coefficient is significant as the p value is almost zero.

The marginal effect of an additional bathroom is the largest among the four variables.
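The goodness-of-fit measures in the Excel output can be recomputed from the ANOVA table, which helps keep R Square and Multiple R apart. The following Python sketch assumes only the sums of squares, n = 534 and k = 4 slope coefficients reported above; the variable names are illustrative.

```python
# ANOVA figures copied from the first regression output.
ss_regression = 185390.2
ss_residual = 177054.0
ss_total = 362444.2
n, k = 534, 4  # observations and number of slope coefficients

# R Square: share of the variation in price explained by the model.
r_squared = ss_regression / ss_total

# Multiple R is the square root of R Square.
multiple_r = r_squared ** 0.5

# Adjusted R Square penalises the number of regressors.
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# F statistic: mean square regression over mean square residual.
f_stat = (ss_regression / k) / (ss_residual / (n - k - 1))

print(f"R Square          = {r_squared:.4f}")
print(f"Multiple R        = {multiple_r:.4f}")
print(f"Adjusted R Square = {adj_r_squared:.4f}")
print(f"F                 = {f_stat:.2f}")
```

The value near 0.715 is Multiple R; the share of variation explained is its square, about 0.51.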

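The t test for the bathroom variable is as follows. Because an additional bathroom is only worthwhile if it raises the price by more than 16,000 CAD (16 in the $1000 units of the data), the coefficient is tested against 16 rather than against zero.

Step 1:

Ho: coefficient for bathroom ≤ 16

H1: coefficient for bathroom > 16

Step 2:

Set significance level = 0.05 (one-tailed test)

Step 3:

The test statistic is t = (17.80661 - 16)/1.770592 = 1.02, with 534 - 4 - 1 = 529 degrees of freedom.

Step 4:

Compare the t value with the one-tailed critical value of about 1.65.  1.02 < 1.65

Step 5:

As the t value is below the critical value, we do not reject Ho.

Step 6: the estimated coefficient is 17.8066, so every additional bathroom is expected to add 17.8066*1000 = $17806.6 to the price of the house, which is above the 16,000 required. However, the difference from 16 is not statistically significant at the 5% level, so the data do not give sufficient evidence that an additional bathroom raises the price by more than 16,000 CAD. On this evidence the owner should not build the additional bathroom. (The t value of 10.06 in the Excel output tests the coefficient against zero and shows only that the bathroom variable must be part of the regression.)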
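Whether the bathroom effect exceeds 16 (in $1000 units) can be checked with a one-tailed t statistic built from the reported coefficient and standard error. The sketch below is an illustration in Python, not part of the Excel workflow. It assumes that, with 529 degrees of freedom, the standard normal quantile is an adequate approximation to the t critical value.

```python
from statistics import NormalDist

# Estimates copied from the first regression output (price in $1000 units).
b_bath = 17.80661
se_bath = 1.770592
threshold = 16.0   # required price gain: 16,000 in $1000 units
alpha = 0.05       # one-tailed significance level

# Ho: beta_bath <= 16 against H1: beta_bath > 16.
t_stat = (b_bath - threshold) / se_bath

# Assumption: with 529 degrees of freedom the t distribution is very close
# to the standard normal, so the normal quantile approximates the critical value.
critical = NormalDist().inv_cdf(1 - alpha)

reject_h0 = t_stat > critical

print(f"t statistic    = {t_stat:.3f}")
print(f"critical value = {critical:.3f} (normal approximation)")
print("reject Ho" if reject_h0 else "do not reject Ho")
```

A t statistic of about 1.02 falls short of the critical value of roughly 1.65, so the point estimate being above 16 is not enough on its own to conclude that the true effect exceeds 16,000.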

The regression results with the interaction variable are:

Regression Statistics
Multiple R           0.718119
R Square             0.515694
Adjusted R Square    0.511108
Standard Error       18.23322
Observations         534

ANOVA
             df    SS          MS          F           Significance F
Regression   5     186910.4    37382.08    112.4441    9E-81
Residual     528   175533.8    332.4504
Total        533   362444.2

             Coefficients  Standard Error  t Stat     P-value    Lower 95%  Upper 95%  Lower 95.0%  Upper 95.0%
Intercept    10.49715      6.027468        1.741552   0.082169   -1.34361   22.33791   -1.34361     22.33791
lot_siz      0.002977      0.000933        3.191357   0.0015     0.001144   0.00481    0.001144     0.00481
bed          5.381932      1.160283        4.638464   4.43E-06   3.102594   7.661271   3.102594     7.661271
bath         10.15036      3.991663        2.54289    0.011278   2.30887    17.99185   2.30887      17.99185
gar          6.251165      1.061288        5.890168   6.87E-09   4.1663     8.336031   4.1663       8.336031
BATH*LOTSIZ  0.001358      0.000635        2.138367   0.032945   0.00011    0.002606   0.00011      0.002606

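The estimated model for price (P, in $1000) is

P = 10.497 + 0.002977*lot size + 5.382*bedroom + 10.1504*bathroom + 6.251*garage + 0.001358*(bath*lot size)

The effect of an additional bathroom on price is the coefficient on bathroom plus the interaction coefficient times lot size, i.e. 10.1504 + 0.001358*lot size. For a house with the average lot size of 5104.69 square feet, the price will change by

 10.1504 + 0.001358*5104.69 = 17.083 or 17.083*1000 = $17083. Price rises by about $17083 with an additional bathroom for a house of average lot size, holding the other variables constant.

The coefficient of the interaction term is 0.001358. It is significant at the 5% level, as the p value of 0.033 is less than 0.05 (and it is therefore also significant at the 10% level), but it is not significant at the 1% level. As the coefficient is positive, the effects of lot size and bathrooms are not independent of each other: each additional square foot of lot size raises the price effect of an additional bathroom by 0.001358*1000 = $1.36. In the same way, the effect of lot size now depends on the number of bathrooms: an additional square foot changes price by 0.002977 + 0.001358*bath, which for a house with one bathroom is 0.002977 + 0.001358 = 0.004335 or $4.34.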
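With an interaction term the effect of an additional bathroom is no longer a single number but a function of lot size. The Python sketch below evaluates that function using the coefficients from the output above; the function name `bathroom_effect` and the constants are illustrative and not part of the original analysis.

```python
# Coefficients copied from the regression output with the interaction term
# (price is measured in $1000 units).
B_BATH = 10.15036
B_INTERACTION = 0.001358
MEAN_LOT_SIZE = 5104.691  # mean of lot_siz from the descriptive statistics


def bathroom_effect(lot_size: float) -> float:
    """Change in predicted price ($1000) from one more bathroom."""
    return B_BATH + B_INTERACTION * lot_size


effect_at_mean = bathroom_effect(MEAN_LOT_SIZE)
print(f"Effect at the mean lot size: {effect_at_mean:.3f} (in $1000)")

# The effect grows with lot size, as the positive interaction implies.
for lot in (1650, 5000, 16200):
    print(f"lot size {lot}: {bathroom_effect(lot):.3f}")
```

Evaluating the effect at the minimum and maximum lot sizes in the sample shows how much the value of an extra bathroom varies across houses.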

The assumptions/conditions are listed below for the model

Y = a + b1*X1 + b2*X2 + b3*X3 + ... + bn*Xn + error

  • The model is linear in the relationship between the explanatory and dependent variables.
  • The error terms are normally distributed.
  • There is no correlation between the error terms.
  • The variance of the errors is equal for all observations (1 to n).

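The correlation matrix is given below. No correlation between any pair of explanatory variables exceeds 0.5 (the largest is 0.374, between bedrooms and bathrooms), which suggests that multicollinearity among these variables is not a problem. The value of 0.54 between the dependent variable (price) and lot size is not relevant to multicollinearity, which concerns only the explanatory variables. The matrix does not include the interaction term, which by construction is correlated with both bath and lot size; the rise in the standard error of bath from 1.77 to 3.99 once the interaction is added is consistent with this.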

            price       lot_siz     bed         bath        gar
price       1
lot_siz     0.541918    1
bed         0.364055    0.146674    1
bath        0.498301    0.184399    0.37413     1
gar         0.393369    0.328838    0.127626    0.172597    1
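A quick way to screen the correlation matrix for multicollinearity is to look for the largest correlation between any two explanatory variables. The Python sketch below does this with the values from the matrix; the 0.5 cut-off follows the text, and pairwise correlations are only a rough screen because they cannot detect dependence among three or more regressors.

```python
# Pairwise correlations between explanatory variables, copied from the matrix.
# Correlations with price are left out: price is the dependent variable.
regressor_corr = {
    ("lot_siz", "bed"): 0.146674,
    ("lot_siz", "bath"): 0.184399,
    ("lot_siz", "gar"): 0.328838,
    ("bed", "bath"): 0.37413,
    ("bed", "gar"): 0.127626,
    ("bath", "gar"): 0.172597,
}

CUTOFF = 0.5  # threshold used in the text; a rule of thumb, not a formal test

# Find the pair of regressors with the largest absolute correlation.
strongest_pair, strongest_value = max(
    regressor_corr.items(), key=lambda item: abs(item[1])
)
flagged = [pair for pair, r in regressor_corr.items() if abs(r) > CUTOFF]

print(f"Strongest pair: {strongest_pair} with r = {strongest_value}")
print(f"Pairs above {CUTOFF}: {flagged}")
```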

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.728089
R Square             0.530113
Adjusted R Square    0.524763
Standard Error       17.97678
Observations         534

ANOVA
             df    SS          MS          F           Significance F
Regression   6     192136.4    32022.74    99.09109    3.69E-83
Residual     527   170307.8    323.1647
Total        533   362444.2

             Coefficients  Standard Error  t Stat     P-value    Lower 95%  Upper 95%  Lower 95.0%  Upper 95.0%
Intercept    12.24988      5.958656        2.055812   0.040293   0.544242   23.95551   0.544242     23.95551
lot_siz      0.002521      0.000927        2.720191   0.00674    0.0007     0.004341   0.0007       0.004341
bed          5.048108      1.146972        4.401246   1.3E-05    2.794908   7.301307   2.794908     7.301307
bath         7.871628      3.976107        1.979732   0.048253   0.060663   15.68259   0.060663     15.68259
gar          6.154036      1.04664         5.8798     7.3E-09    4.097937   8.210135   4.097937     8.210135
BATH*LOTSIZ  0.001687      0.000632        2.670411   0.00781    0.000446   0.002927   0.000446     0.002927
b_ment       6.667565      1.658029        4.02138    6.63E-05   3.410407   9.924722   3.410407     9.924722

P = 12.2499 + 0.002521*lot size + 5.0481*bedroom + 7.8716*bathroom + 6.1540*garage + 6.6676*basement + 0.001687*(bath*lot size)

The coefficient of basement is positive, which means that having a basement adds to the price: a house with a basement is expected to sell for 6.6676*1000 = $6668 (rounded) more than an otherwise identical house without one.

To compare the models with and without the basement variable we must look at the significance of the added variable as well as the effect on the adjusted R^2 value. Note that the p value of the coefficient of basement is almost zero (6.63E-05), which makes it significant.

A simple comparison of R^2 values shows a marginal improvement from 0.516 to 0.530 (Multiple R rises from 0.718 to 0.728). The value of adjusted R^2 is also marginally better in the expanded model (0.525 against 0.511 in the smaller model), and the standard error of the regression falls from 18.23 to 17.98. The F statistic of 99.09, with a Significance F of 3.69E-83, shows that the expanded model as a whole is valid.

A point prediction uses all the estimated coefficients along with the given values of the explanatory variables:

Predicted value of price = 12.2499 + 0.002521*lot size + 5.0481*bedroom + 7.8716*bathroom + 6.1540*garage + 6.6676*basement + 0.001687*(bath*lot size)

Without a basement (note that the interaction term uses the number of bathrooms, 1, times the lot size):

= 12.2499 + 0.002521*5000 + 5.0481*3 + 7.8716*1 + 6.1540*1 + 6.6676*0 + 0.001687*(1*5000) = 62.460 or about $62460.

Now we include a basement, which changes only the basement term:

= 12.2499 + 0.002521*5000 + 5.0481*3 + 7.8716*1 + 6.1540*1 + 6.6676*1 + 0.001687*(1*5000) = 69.127 or about $69127.

The difference between the two predictions, 6.6676 or about $6668, is equal to the coefficient of basement*1000, as the units of price are in 1000 dollars.
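The two point predictions can be reproduced by plugging the house characteristics into the estimated equation. The Python sketch below uses the coefficients from the SUMMARY OUTPUT table; the function `predict_price` and the dictionary `COEF` are illustrative names. With 1 bathroom, the interaction term is 1 times the lot size.

```python
# Coefficients copied from the regression output that includes b_ment
# (price is measured in $1000 units).
COEF = {
    "intercept": 12.24988,
    "lot_siz": 0.002521,
    "bed": 5.048108,
    "bath": 7.871628,
    "gar": 6.154036,
    "bath_lot": 0.001687,  # interaction: bath * lot_siz
    "b_ment": 6.667565,
}


def predict_price(lot_siz: float, bed: int, bath: int, gar: int, b_ment: int) -> float:
    """Point prediction of price in $1000 units."""
    return (
        COEF["intercept"]
        + COEF["lot_siz"] * lot_siz
        + COEF["bed"] * bed
        + COEF["bath"] * bath
        + COEF["gar"] * gar
        + COEF["bath_lot"] * bath * lot_siz
        + COEF["b_ment"] * b_ment
    )


no_basement = predict_price(5000, bed=3, bath=1, gar=1, b_ment=0)
with_basement = predict_price(5000, bed=3, bath=1, gar=1, b_ment=1)

print(f"No basement:   {no_basement:.3f} (in $1000)")
print(f"With basement: {with_basement:.3f} (in $1000)")
print(f"Difference:    {with_basement - no_basement:.3f} (in $1000)")
```

Because basement enters the model only as an indicator with no interaction, the gap between the two predictions is the basement coefficient whatever the other characteristics are.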

From a marketing perspective it may be sensible to ask which feature of a home fetches the highest price. These features include bathrooms, garages, bedrooms and a basement. In the model we choose, the feature with the largest coefficient is the one that adds most to the price, holding the other features constant. The marketing team can then promote this feature more aggressively, as the data given to us suggest that buyers are more willing to pay for it.
