The descriptive statistics indicate that income is not normally distributed, since the measures of central tendency (mean, median and mode) do not converge to a common value. In addition, the kurtosis deviates from 3, the value expected for a normal distribution. The skew, however, is small enough to be treated as zero, so the distribution can be regarded as symmetrical. Finally, the descriptive statistics above indicate low dispersion for this variable.
Household Size
The descriptive statistics indicate that household size is not normally distributed, since the measures of central tendency (mean, median and mode) do not converge to a common value. In addition, the kurtosis deviates from 3, the value expected for a normal distribution. The skew is significant and cannot be treated as zero, so the distribution is not symmetrical. Finally, the descriptive statistics above indicate high dispersion for this variable.
Amount Charged
The descriptive statistics indicate that amount charged is not normally distributed, since the measures of central tendency (mean, median and mode) do not converge to a common value. In addition, the kurtosis deviates from 3, the value expected for a normal distribution. The skew, however, is small enough to be treated as zero, so the distribution can be regarded as symmetrical. Finally, the descriptive statistics above indicate low dispersion for this variable.
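The checks applied to each variable above (agreement of mean and median, skew close to zero, kurtosis close to 3) can be sketched in Python as follows. This is only an illustration, not the Excel procedure used for the report: the function name `describe` and the sample values are hypothetical, and the moments are population moments. Note also that Excel's KURT function reports excess kurtosis, for which the normal reference value is 0 rather than 3.

```python
import statistics


def describe(values):
    """Summarise one variable with the measures discussed above."""
    n = len(values)
    mean = statistics.fmean(values)
    # Population central moments (dividing by n, not n - 1).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return {
        "mean": mean,
        "median": statistics.median(values),
        "skewness": m3 / m2 ** 1.5,  # 0 for a perfectly symmetrical sample
        "kurtosis": m4 / m2 ** 2,    # about 3 for normally distributed data
        "cv": statistics.stdev(values) / mean,  # relative dispersion
    }


# Hypothetical values, for illustration only (not the report's data).
sample = [1, 2, 3, 4, 5]
summary = describe(sample)
print(summary)
```

A large gap between mean and median, a skewness far from 0, or a kurtosis far from 3 would each point away from normality, which is the reasoning used for the three variables.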
Output of Linear Regression
Model 2 (R2) = 0.5668
From the above result, model 2 yields a higher R2 value than model 1. This indicates that model 2 is the superior model, which leads to the conclusion that household size is the better estimator.
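The comparison of two single-predictor models by R2 can be sketched as below. The data are hypothetical and are not the report's data, the helper `r_squared` is an illustrative name, and the assumption that model 1 uses income as its predictor is inferred from the order of the sections rather than stated in the text.

```python
import statistics


def r_squared(x, y):
    """R2 of a simple least-squares regression of y on x."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Unexplained variation (residuals) versus total variation in y.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical data, for illustration only.
income = [30, 45, 50, 62, 70, 85]
household_size = [1, 2, 2, 4, 5, 6]
amount_charged = [2100, 3100, 2900, 4800, 5600, 6500]

r2_by_predictor = {
    "income": r_squared(income, amount_charged),
    "household_size": r_squared(household_size, amount_charged),
}
# The predictor with the larger R2 explains more of the variation in y.
better = max(r2_by_predictor, key=r2_by_predictor.get)
print(r2_by_predictor, better)
```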
Output of Multiple Regression Analysis
Model 3 (R2) = 0.8254
The R2 value of this model is considerably higher, which signifies better predictive power than its predecessors. The Excel output also shows, through the F test results, that the regression model is significant.
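For reference, the overall F statistic of a regression can be obtained from its R2, the number of observations and the number of predictors. The sketch below uses the R2 of model 3 from the text; the sample size is not given in this section, so `n_obs=50` is a hypothetical placeholder and the resulting figure is not the report's F value.

```python
def regression_f_statistic(r2, n_obs, n_predictors):
    """Overall F statistic of a regression, computed from its R2."""
    df_model = n_predictors
    df_resid = n_obs - n_predictors - 1
    # Explained variation per model df over unexplained variation per residual df.
    return (r2 / df_model) / ((1 - r2) / df_resid)


# R2 is taken from the text; n_obs is a hypothetical placeholder.
f_stat = regression_f_statistic(r2=0.8254, n_obs=50, n_predictors=2)
print(round(f_stat, 2))
```

The model is judged significant when this statistic exceeds the critical F value for (n_predictors, n_obs - n_predictors - 1) degrees of freedom at the chosen level of significance.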
In model 3, the amount charged can be predicted by plugging in the values of the two independent variables, i.e. household size and income. This is demonstrated below.
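A minimal sketch of fitting a two-predictor model and plugging values into it is given here. The fitted coefficients of model 3 are not reproduced in this section, so the data are hypothetical (generated from y = 1 + 2*x1 + 3*x2 so that the result is easy to check), and the function names are illustrative.

```python
import statistics


def fit_two_predictors(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via the normal equations."""
    m1, m2, my = statistics.fmean(x1), statistics.fmean(x2), statistics.fmean(y)
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2  # zero if the predictors are perfectly collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2


def predict(coefs, x1, x2):
    """Plug predictor values into the fitted equation."""
    b0, b1, b2 = coefs
    return b0 + b1 * x1 + b2 * x2


# Hypothetical data: x1 stands in for household size, x2 for income.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
y = [9, 8, 19, 18, 29]  # exactly 1 + 2*x1 + 3*x2

coefs = fit_two_predictors(x1, x2, y)
predicted = predict(coefs, 3, 5)
print(coefs, predicted)
```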
The R2 value has increased with the addition of an independent variable, as seen in model 3, but it would need to rise further to approach the theoretical maximum of 1. This would require adding more independent variables to the model, such as the following.
- Dummy variable for gender
- Variable such as age
- Interest rate charged on the underlying credit card
- Economic growth, as it affects spending patterns
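As an illustration of the first suggestion, a two-category variable such as gender enters a regression as a 0/1 dummy. The sketch below assumes hypothetical labels and a hypothetical helper name; the category chosen as the reference is coded 0.

```python
def encode_dummy(labels, reference):
    """Code a two-category variable as 0/1, with `reference` coded as 0."""
    return [0 if label == reference else 1 for label in labels]


# Hypothetical labels, for illustration only.
gender = ["female", "male", "male", "female"]
gender_dummy = encode_dummy(gender, reference="female")
print(gender_dummy)  # [0, 1, 1, 0]
```

The coefficient on such a dummy is then read as the average difference in the dependent variable between the coded category and the reference category, holding the other predictors fixed.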
A brief discussion of the correlations is given below.
Points for discussion and assumptions:
- “If the absolute value of the correlation coefficient is higher than 0.5, the correlation is said to be strong; if it is lower than 0.5, the correlation is weak.
- If the sign of the correlation coefficient is positive, the variables are categorised as positively correlated; if the sign is negative, the variables are negatively correlated.
- The level of significance is 5%.
- The degrees of freedom are 96.
- If the significance value of a correlation is higher than the level of significance (alpha), the correlation is termed insignificant; if it is lower than alpha, the correlation is significant.
- The correlation coefficient, t value and significance value are computed in Excel and discussed here.”
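The decision rules listed above can be sketched in Python as follows. The function names are hypothetical. The rules do not say how to treat a coefficient of exactly 0.5 or a significance value exactly equal to alpha, so the boundary handling here is an assumption. The `t_statistic` helper uses the standard formula for testing a correlation against zero; the t values quoted in this document come from the Excel output and are not recomputed here.

```python
import math

ALPHA = 0.05  # level of significance stated above


def t_statistic(r, df):
    """Standard t statistic for testing a correlation coefficient against zero."""
    return r * math.sqrt(df / (1 - r ** 2))


def classify_correlation(r, p_value, alpha=ALPHA):
    """Apply the sign, strength and significance rules to one correlation."""
    direction = "positive" if r > 0 else "negative" if r < 0 else "none"
    # Assumption: exactly 0.5 counts as weak, exactly alpha as non-significant.
    strength = "strong" if abs(r) > 0.5 else "weak"
    significance = "significant" if p_value < alpha else "non-significant"
    return direction, strength, significance


print(classify_correlation(0.562, 0.00))
# ('positive', 'strong', 'significant')
print(classify_correlation(-0.155, 0.075))
# ('negative', 'weak', 'non-significant')
```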
“The correlation coefficient is 0.562, so the variables are positively correlated and the correlation is strong. The t value is 12.549, with a significance value reported as 0.00. The correlation between the respective variables is therefore significant, because the significance value < level of significance.”
“The correlation coefficient is 0.121, so the variables are positively correlated and the correlation is weak. The t value is 1.350, with a significance value of 0.180. The correlation between the respective variables is therefore non-significant, because the significance value > level of significance.”
“The correlation coefficient is 0.549, so the variables are positively correlated and the correlation is strong. The t value is 11.926, with a significance value reported as 0.00. The correlation between the respective variables is therefore significant, because the significance value < level of significance.”
“The correlation coefficient is 0.120, so the variables are positively correlated and the correlation is weak. The t value is 1.337, with a significance value of 0.184. The correlation between the respective variables is therefore non-significant, because the significance value > level of significance.”
“The correlation coefficient is -0.131, so the variables are negatively correlated and the correlation is weak. The t value is -1.479, with a significance value of 0.142. The correlation between the respective variables is therefore non-significant, because the significance value > level of significance.”
“The correlation coefficient is -0.004, so the variables are negatively correlated and the correlation is weak. The t value is -0.040, with a significance value of 0.968. The correlation between the respective variables is therefore non-significant, because the significance value > level of significance.”
“The correlation coefficient is 0.101, so the variables are positively correlated and the correlation is weak. The t value is 1.095, with a significance value of 0.276. The correlation between the respective variables is therefore non-significant, because the significance value > level of significance.”
“The correlation coefficient is -0.038, so the variables are negatively correlated and the correlation is weak. The t value is -0.384, with a significance value of 0.702. The correlation between the respective variables is therefore non-significant, because the significance value > level of significance.”
“The correlation coefficient is -0.155, so the variables are negatively correlated and the correlation is weak. The t value is -1.802, with a significance value of 0.075. The correlation between the respective variables is therefore non-significant, because the significance value > level of significance.”
“The correlation coefficient is 0.014, so the variables are positively correlated and the correlation is weak. The t value is 0.140, with a significance value of 0.889. The correlation between the respective variables is therefore non-significant, because the significance value > level of significance.”
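The ten results above can be summarised by applying the 5% level of significance and the 0.5 strength threshold to the reported figures. The sketch below uses only the correlation coefficients and significance values quoted in the text, in the order in which they appear; the variable pairs are not named in this section, so none are assumed.

```python
ALPHA = 0.05  # level of significance used throughout

# (correlation coefficient, significance value) pairs as reported above.
reported = [
    (0.562, 0.00), (0.121, 0.180), (0.549, 0.00), (0.120, 0.184),
    (-0.131, 0.142), (-0.004, 0.968), (0.101, 0.276), (-0.038, 0.702),
    (-0.155, 0.075), (0.014, 0.889),
]

significant = [r for r, p in reported if p < ALPHA]
strong = [r for r, p in reported if abs(r) > 0.5]

print(significant)  # [0.562, 0.549]
print(strong)       # [0.562, 0.549]
```

Only the two strong positive correlations are significant at the 5% level; the remaining eight are weak and non-significant.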
The key observation is that the mean scores are unequal, which reflects location being a potent factor in mental depression for the underlying sample.
The key observation is that the mean scores are equal, which reflects location not being a potent factor in mental depression for the underlying sample.
H0: For a group of healthy participants selected from the given three cities, there is no significant difference in the mean depression score across cities.
H1: For a group of healthy participants selected from the given three cities, there is a significant difference in the mean depression score across cities.
The calculated F statistic exceeds Fcrit, hence the null hypothesis is rejected and the alternative hypothesis is accepted.
H0: For a group of aged and ill participants selected from the given three cities, there is no significant difference in the mean depression score across cities.
H1: For a group of aged and ill participants selected from the given three cities, there is a significant difference in the mean depression score across cities.
The calculated F statistic does not exceed Fcrit, hence the null hypothesis is not rejected and the alternative hypothesis is not accepted.
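The F test applied to both groups can be sketched as a one-way ANOVA across the three cities. The scores below are hypothetical and are not the report's data, the function names are illustrative, and a 5% level of significance is assumed for the critical value.

```python
import statistics


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.fmean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of the observations around their own group mean.
    ss_within = sum((v - statistics.fmean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def decide(f_stat, f_crit):
    """Decision rule used in the text: reject H0 only if F exceeds Fcrit."""
    return "reject H0" if f_stat > f_crit else "do not reject H0"


# Hypothetical depression scores for three cities.
city_a = [1, 2, 3]
city_b = [2, 3, 4]
city_c = [3, 4, 5]
f_stat, df_b, df_w = one_way_anova_f([city_a, city_b, city_c])
# 5.14 is the tabulated 5% critical value of F for (2, 6) degrees of freedom.
decision = decide(f_stat, f_crit=5.14)
print(f_stat, df_b, df_w, decision)
```

With these illustrative scores F is 3.0, which is below the critical value, so the null hypothesis of equal means would not be rejected.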
The following conclusions can be drawn from the above testing.
For people who are young and healthy, location tends to act as a significant variable affecting the incidence of mental depression. This is in line with expectations, since in the absence of other concerns, external factors can be important.
For people who are old and sick, location does not tend to act as a significant variable affecting the incidence of mental depression. This is in line with expectations, since for such people personal concerns take a toll on their mental and emotional health, and the effect of location is overshadowed by these factors.