1.1 Having in mind the main research question, select the best linear regression model, using the least squares method with backward stepwise variable elimination, at α = 1%. Describe your analysis step by step, providing the relevant results at each step.
1.2 Give the interpretation of the regression coefficients of the selected model. Challenge the feasibility of the sign and magnitude of the coefficients in your model and, if necessary, propose an alternative.
1.3 For the selected model, calculate the coefficient of determination and give its interpretation in terms of the given research question.
2.1 Using the natural logarithm transformation of all but the dummy variables, repeat the exploration described in 1.1, this time at α = 4%. Once more, describe your analysis step by step, providing the relevant results at each step.
2.2 Give the interpretation of the regression coefficients of the selected model. Challenge the feasibility of the sign and magnitude of the coefficients in your model and, if necessary, propose an alternative.
Subject 1:
1.1 The analysis starts with a check for collinearity among the independent variables. Their pairwise correlations are shown below:

            WAGES      KCAPITAL   Labor      D1        D2
WAGES       1
KCAPITAL    0.905554   1
Labor       0.564246   0.250203   1
D1          0.025988   0.028247   0.02952    1
D2          0.028428   0.02534    0.073159   0.06072   1
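A correlation screen of this kind, with 0.8 as the cut-off, can be sketched as follows. This is only an illustration: the helper names `pearson` and `flag_collinear` and the small `toy` dataset are assumptions made for the example and are not the assignment data.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def flag_collinear(columns, threshold=0.8):
    """Return (name_a, name_b, r) for every pair with |r| above the threshold."""
    flagged = []
    for a, b in combinations(columns, 2):
        r = pearson(columns[a], columns[b])
        if abs(r) > threshold:
            flagged.append((a, b, round(r, 3)))
    return flagged


# Made-up illustrative values, NOT the assignment dataset.
toy = {
    "WAGES": [10, 20, 30, 40, 50],
    "KCAPITAL": [12, 19, 33, 41, 48],
    "Labor": [5, 1, 4, 2, 3],
}
print(flag_collinear(toy))
```

Each flagged pair is a candidate for dropping one of its two variables before the regression is fitted.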
The correlation between WAGES and KCAPITAL (0.906) is greater than 0.8. WAGES is therefore removed from the dataset, and the remaining variables are not dangerously correlated. The regression of the dependent variable on the remaining four independent variables (KCAPITAL, Labor, D1 and D2) is given below:









Regression Statistics
Multiple R           0.82712
R Square             0.684128
Adjusted R Square    0.681428
Standard Error       17644.38
Observations         473

ANOVA
              df     SS          MS          F          Significance F
Regression    4      3.16E+11    7.89E+10    253.403    1.2E-115
Residual      468    1.46E+11    3.11E+08
Total         472    4.61E+11

                           Coefficients   Standard Error   t Stat      P-value     Lower 95%   Upper 95%   Lower 99.0%   Upper 99.0%
Intercept                  -518.847       1524.07          -0.34044    0.733682    -3513.71    2476.02     -4460.66      3422.97
X Variable 1 (KCAPITAL)    0.74864        0.026659         28.08157    2E-102      0.696253    0.801027    0.679689      0.817591
X Variable 2 (Labor)       147.2564       21.67842         6.792765    3.35E-11    104.6573    189.8555    91.1879       203.325
X Variable 3 (D1)          842.2054       1694.082         0.497145    0.61932     -2486.74    4171.155    -3539.33      5223.738
X Variable 4 (D2)          7993.062       1896.699         4.214195    3.01E-05    4265.96     11720.16    3087.485      12898.64
The table shows that the overall fit is good (R Square = 0.684, with Significance F far below 0.01), but X Variable 3, which is D1, has a p-value of 0.619, higher than 0.01. D1 is therefore removed from the model. The regression of the same dependent variable on the remaining independent variables is given below:
Regression Statistics
Multiple R           0.827019333
R Square             0.683960977
Adjusted R Square    0.681939405
Standard Error       17630.21504
Observations         473

ANOVA
              df     SS          MS          F           Significance F
Regression    3      3.15E+11    1.05E+11    338.3313    7E-117
Residual      469    1.46E+11    3.11E+08
Total         472    4.61E+11

                           Coefficients   Standard Error   t Stat      P-value     Lower 95%   Upper 95%   Lower 99.0%   Upper 99.0%
Intercept                  13.95975517    1082.726         0.012893    0.989719    -2113.63    2141.554    -2786.35      2814.271
X Variable 1 (KCAPITAL)    0.749167279    0.026617         28.14623    8.4E-103    0.696864    0.801471    0.680326      0.818008
X Variable 2 (Labor)       146.7921723    21.64091         6.783087    3.56E-11    104.267     189.3173    90.82115      202.7632
X Variable 3 (D2)          8054.211397    1891.187         4.258812    2.48E-05    4337.963    11770.46    3162.934      12945.49
The fit is again good, and the p-values of all slope coefficients are below 0.01, so no further variable is eliminated and this is the selected model. The intercept (13.96) is not significantly different from zero (p = 0.99) and is left out of the equation. The required regression equation is therefore:
Y = (0.75)*KCAPITAL + (146.79)*Labor + (8054.21)*D2.
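The elimination procedure followed above (fit, drop the predictor with the largest p-value if it is not below α, refit) can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the spreadsheet procedure used for the tables: the function names `ols_pvalues` and `backward_eliminate` are hypothetical, the data are simulated rather than taken from the assignment, and the p-values use the normal approximation to the t distribution, which is only reasonable for large samples such as n = 473.

```python
import math

import numpy as np


def ols_pvalues(X, y):
    """OLS fit with an intercept; returns coefficients and two-sided p-values.

    Assumption: p-values use the normal approximation to the t distribution.
    """
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    dof = A.shape[0] - A.shape[1]
    sigma2 = resid @ resid / dof
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(A.T @ A)))
    t = beta / se
    p = np.array([math.erfc(abs(v) / math.sqrt(2)) for v in t])
    return beta, p


def backward_eliminate(X, y, names, alpha=0.01):
    """Drop the least significant predictor until all p-values are below alpha."""
    names = list(names)
    while names:
        _, p = ols_pvalues(X, y)
        slope_p = p[1:]  # the intercept is never a removal candidate
        worst = int(np.argmax(slope_p))
        if slope_p[worst] < alpha:
            break
        X = np.delete(X, worst, axis=1)
        del names[worst]
    return names


# Simulated data for illustration only; D1 has no true effect here.
rng = np.random.default_rng(0)
n = 473
kcap = rng.normal(size=n)
labor = rng.normal(size=n)
d1 = rng.integers(0, 2, size=n)
d2 = rng.integers(0, 2, size=n)
y = 0.75 * kcap + 1.5 * labor + 2.0 * d2 + rng.normal(size=n)
X = np.column_stack([kcap, labor, d1, d2])
names = ["KCAPITAL", "Labor", "D1", "D2"]
print(backward_eliminate(X, y, names, alpha=0.01))
```

Only one variable is removed per pass, because removing a predictor changes the p-values of those that remain.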
1.2 The coefficient of KCAPITAL (0.75) is the average increase in the dependent variable for a one-unit increase in KCAPITAL, with Labor and D2 held fixed. The coefficient of Labor (146.79) is the average increase in the dependent variable for a one-unit increase in Labor, with KCAPITAL and D2 held fixed. D2 is a dummy variable, so its coefficient (8054.21) is the average difference in Y between observations with D2 = 1 and observations with D2 = 0, with KCAPITAL and Labor held fixed. The magnitude of the KCAPITAL coefficient can be challenged, since capital could be expected to have a much larger effect in business. The sign of the Labor coefficient can also be challenged, since a very large labor force can have a negative impact, and its magnitude could be lower. In the light of these arguments, an alternative model can be proposed, for example:
Y = (5)*KCAPITAL - (90)*Labor + (8054.21)*D2.
1.3 The coefficient of determination is the proportion of the variation in the dependent variable that is explained by the independent variables. Here R Square = 0.684, so about 68% of the variation in industrial production is explained by KCAPITAL, Labor and D2.
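As a cross-check, the adjusted R Square and the F statistic of the selected model can be recomputed from R Square, the number of observations (473) and the number of predictors (3). The function names below are illustrative choices for this example.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for n observations and k predictors (intercept excluded)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def f_statistic(r2, n, k):
    """Overall F statistic of the regression, computed from R^2."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


# Values taken from the regression output of the selected model.
r2, n, k = 0.683960977, 473, 3
print(adjusted_r2(r2, n, k))
print(f_statistic(r2, n, k))
```

Both results should agree with the Adjusted R Square and F entries reported for the selected model.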
Subject 2:
2.1 The analysis again starts with a check for collinearity among the independent variables, this time after taking the natural logarithm of all but the dummy variables. The pairwise correlations are shown below:

            WAGES      KCAPITAL   Labor      D1        D2
WAGES       1
KCAPITAL    0.844151   1
Labor       0.960251   0.751036   1
D1          0.027968   0.03644    0.004177   1
D2          0.12812    0.07081    0.155761   0.06072   1
The correlations of WAGES with KCAPITAL (0.844) and with Labor (0.960) are greater than 0.8. WAGES is therefore removed from the dataset, and the remaining variables are not dangerously correlated. The regression of the dependent variable on the remaining four independent variables (KCAPITAL, Labor, D1 and D2) is given below:










Regression Statistics
Multiple R           0.971118892
R Square             0.943071903
Adjusted R Square    0.942585338
Standard Error       0.131050435
Observations         473

ANOVA
              df     SS          MS          F           Significance F
Regression    4      133.1499    33.28748    1938.224    1.2E-289
Residual      468    8.037533    0.017174
Total         472    141.1875

                              Coefficients   Standard Error   t Stat      P-value     Lower 95%   Upper 95%   Lower 96.0%   Upper 96.0%
Intercept                     0.728946993    0.045026         16.18956    3.92E-47    0.640469    0.817425    0.636217      0.821677
X Variable 1 (ln KCAPITAL)    0.745283949    0.017807         41.85416    2.6E-160    0.710293    0.780275    0.708611      0.781957
X Variable 2 (ln Labor)       0.302633323    0.020118         15.04322    5.29E-42    0.263101    0.342165    0.261201      0.344065
X Variable 3 (D1)             0.000311127    0.012578         0.024736    0.980276    -0.0244     0.025027    -0.02559      0.026215
X Variable 4 (D2)             0.291151384    0.014821         19.64427    4.23E-63    0.262027    0.320276    0.260627      0.321675
The table shows that the overall fit is good (R Square = 0.943), but X Variable 3, which is D1, has a p-value of 0.980, higher than 0.04. D1 is therefore removed from the model. The regression of the same dependent variable on the remaining independent variables is given below:
Regression Statistics
Multiple R           0.971119
R Square             0.943072
Adjusted R Square    0.942708
Standard Error       0.130911
Observations         473

ANOVA
              df     SS          MS         F           Significance F
Regression    3      133.1499    44.3833    2589.817    2.2E-291
Residual      469    8.037544    0.017138
Total         472    141.1875

                              Coefficients   Standard Error   t Stat      P-value     Lower 95%   Upper 95%   Lower 96.0%   Upper 96.0%
Intercept                     0.729187       0.043918         16.60335    4.95E-49    0.642887    0.815488    0.638739      0.819636
X Variable 1 (ln KCAPITAL)    0.745264       0.01777          41.93906    8.4E-161    0.710345    0.780183    0.708667      0.781862
X Variable 2 (ln Labor)       0.302649       0.020087         15.06726    4E-42       0.263178    0.342119    0.261281      0.344016
X Variable 3 (D2)             0.291168       0.01479          19.68681    2.48E-63    0.262105    0.320231    0.260708      0.321628
The fit is again very good, and the p-values of all coefficients are below 0.04, so no further variable is eliminated and this is the selected model. The required regression equation is therefore:
ln(Y) = 0.73 + 0.75*ln(KCAPITAL) + 0.30*ln(Labor) + 0.29*D2.
2.2. Because all but the dummy variables are in natural logarithms, the coefficients of KCAPITAL and Labor are elasticities. A 1% increase in KCAPITAL is associated with an average increase of about 0.75% in the dependent variable, with Labor and D2 held fixed. A 1% increase in Labor is associated with an average increase of about 0.30% in the dependent variable, with KCAPITAL and D2 held fixed. D2 is a dummy variable, so its coefficient is the average difference in ln(Y) between observations with D2 = 1 and observations with D2 = 0; this corresponds to a dependent variable that is about 34% higher (exp(0.29) - 1) when D2 = 1. The magnitude of the KCAPITAL coefficient can be challenged, since capital could be expected to have a much larger effect in business. The sign of the Labor coefficient can also be challenged, since a small labor force can have a negative impact, and its magnitude could be higher. In the light of these arguments, an alternative model can be proposed, for example:
ln(Y) = (5)*ln(KCAPITAL) - (90)*ln(Labor) + (0.29)*D2.
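Coefficients of a model fitted on logarithms are easier to read once converted to percentage changes. The sketch below does this for the coefficients reported in the selected log model; the function names are illustrative, and the calculation assumes natural logarithms on both sides for KCAPITAL and Labor and an untransformed 0/1 dummy for D2.

```python
import math


def pct_change_from_elasticity(beta, pct_increase):
    """Exact % change in Y when X rises by pct_increase % in a log-log model."""
    return ((1 + pct_increase / 100) ** beta - 1) * 100


def pct_effect_of_dummy(beta):
    """Exact % difference in Y when a 0/1 dummy switches on, with ln(Y) as response."""
    return (math.exp(beta) - 1) * 100


# Coefficients taken from the selected log model.
print(pct_change_from_elasticity(0.745264, 1))  # KCAPITAL up by 1%
print(pct_change_from_elasticity(0.302649, 1))  # Labor up by 1%
print(pct_effect_of_dummy(0.291168))            # D2 = 1 versus D2 = 0
```

For small changes the elasticity itself is a good approximation of the percentage effect, while for a dummy variable the exponential conversion matters more.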
Subject 3.
The residual plots and normality plots show that, for the log-linear model, the assumptions of homoscedasticity and independence of the residuals are not met, while the normality assumption is met. For the linear model, normality and homoscedasticity are not met, but the residuals are independent. The residual and normality plots are attached below:
Residual plot for the log linear model.
Normality plot for log linear model.
Residual plot for linear model.
Normality plot for linear model.
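The independence of residuals, judged visually above, can also be summarised numerically. One common choice is the Durbin-Watson statistic, sketched below; it is not used in the original analysis and is offered only as a complement, under the assumption that the observations have a meaningful order. Values near 2 suggest no first-order autocorrelation, values towards 0 suggest positive autocorrelation, and values towards 4 suggest negative autocorrelation.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences over sum of squares."""
    num = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    den = sum(e ** 2 for e in residuals)
    return num / den


# Made-up residuals for illustration only.
alternating = [1.0, -1.0, 1.0, -1.0]  # strong negative autocorrelation
constant = [1.0, 1.0, 1.0, 1.0]       # strong positive autocorrelation
print(durbin_watson(alternating))
print(durbin_watson(constant))
```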
The independent variables should be chosen accordingly.