1. A sample of 50 consumers has been provided, and Consumer Inc needs to analyse the data on behalf of the client. A key first step in analysing the sample data is to compute descriptive statistics, which are indicated below.
![]()
Income
The dispersion in the data is limited, as reflected in the various measures (IQR, standard deviation, variance). The distribution can also be treated as roughly symmetric, since the positive skew is close to zero. However, the measures of central tendency do not coincide, so the variable is not normally distributed.
![]()
Household Size
The dispersion in the data is quite high, as reflected in the various measures (IQR, standard deviation, variance). The distribution cannot be treated as symmetric, since the positive skew is quite significant. In addition, the measures of central tendency do not coincide, so the variable is not normally distributed.
![]()
Amount Charged
The dispersion in the data is limited, as reflected in the various measures (IQR, standard deviation, variance). The distribution can also be treated as roughly symmetric, since the positive skew is close to zero. However, the measures of central tendency do not coincide, so the variable is not normally distributed.
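The measures discussed above can be reproduced outside Excel. The following is a minimal sketch in Python with NumPy; the function name `describe` and the sample values are illustrative assumptions, not the survey data, and the skewness shown is the plain moment-based form, which differs slightly from Excel's small-sample-corrected `SKEW`.

```python
import numpy as np

def describe(values):
    """Return the summary measures used above for a single variable."""
    x = np.asarray(values, dtype=float)
    q1, median, q3 = np.percentile(x, [25, 50, 75])
    mean = x.mean()
    std = x.std(ddof=1)  # sample standard deviation (n - 1 in the denominator)
    # Moment-based skewness; Excel's SKEW applies an extra small-sample correction.
    skew = np.mean((x - mean) ** 3) / x.std(ddof=0) ** 3
    return {
        "mean": mean,
        "median": median,
        "std": std,
        "variance": std ** 2,
        "iqr": q3 - q1,
        "skew": skew,
    }

# Made-up illustrative values, NOT the 50-consumer sample.
sample = [2, 3, 3, 4, 4, 4, 5, 5, 6, 9]
stats = describe(sample)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

In this made-up sample the mean exceeds the median and the skew is positive, which is the same pattern used above to argue against normality.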
2.
Using the Data Analysis function in Excel, two simple linear regression models have been run, as shown below.
![]()
The key measure of a regression model's predictive usefulness is R², which in this case is 0.3979. Income therefore explains only about 40% of the variation in the amount charged.
![]()
The key measure of a regression model's predictive usefulness is R², which in this case is 0.5668. Household size therefore explains about 57% of the variation in the amount charged.
Since R² for household size (0.5668) exceeds R² for income (0.3979), household size is the better single independent variable for predicting the amount charged on the card.
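The comparison of the two simple models can be sketched in Python as follows. This is only an illustration: the helper `simple_r2` and the data values are assumptions made up for the example, not the Excel output or the sample itself.

```python
import numpy as np

def simple_r2(x, y):
    """R² of the least-squares line y = intercept + slope * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)  # degree-1 fit returns [slope, intercept]
    ss_res = np.sum((y - (intercept + slope * x)) ** 2)  # unexplained variation
    ss_tot = np.sum((y - y.mean()) ** 2)                 # total variation
    return 1.0 - ss_res / ss_tot

# Made-up illustrative values, NOT the 50-consumer sample.
income = [25, 32, 40, 48, 55, 61]
household_size = [1, 2, 2, 4, 5, 3]
amount_charged = [2100, 2900, 3000, 4400, 5200, 4000]

r2_income = simple_r2(income, amount_charged)
r2_household = simple_r2(household_size, amount_charged)
better = "household size" if r2_household > r2_income else "income"
print(f"R² income = {r2_income:.4f}, R² household size = {r2_household:.4f}")
print(f"Better single predictor in this toy data: {better}")
```

For a model with one predictor, R² equals the squared correlation between the predictor and the response, so the same ranking could be obtained from the two correlation coefficients.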
3.
The multiple regression model, unlike the simple linear models above, accounts for both independent variables together, as shown below.
![]()
The key measure of a regression model's predictive usefulness is R², which in this case is 0.8254. Household size and income together therefore explain about 83% of the variation in the amount charged.
Given the increase in R² over the two simple models, this model is clearly better in terms of predictability. The individual coefficients and the overall model are also statistically significant, as is evident from the Excel output above.
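A two-predictor fit of this kind can be sketched with ordinary least squares in NumPy. The function `fit_multiple` and the data below are assumptions for illustration only; the sketch does not reproduce the Excel coefficients or significance tests.

```python
import numpy as np

def fit_multiple(x1, x2, y):
    """Fit y = b0 + b1 * x1 + b2 * x2 by least squares; return (coefficients, R²)."""
    x1, x2, y = (np.asarray(v, dtype=float) for v in (x1, x2, y))
    # Design matrix with a leading column of ones for the intercept.
    design = np.column_stack([np.ones_like(x1), x1, x2])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coeffs
    r2 = 1.0 - np.sum((y - fitted) ** 2) / np.sum((y - y.mean()) ** 2)
    return coeffs, r2

# Made-up illustrative values, NOT the 50-consumer sample.
income = [25, 32, 40, 48, 55, 61]
household_size = [1, 2, 2, 4, 5, 3]
amount_charged = [2100, 2900, 3000, 4400, 5200, 4000]

coeffs, r2_multiple = fit_multiple(income, household_size, amount_charged)
print("intercept, income slope, household slope:", np.round(coeffs, 3))
print(f"R² = {r2_multiple:.4f}")
```

Because the two-predictor model contains each simple model as a special case, its R² on the same data can never be lower than either simple model's R².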
4.
The multiple regression equation obtained above is applied to the required input values for income (40) and household size (3), as seen below.
![]()
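The substitution step can be expressed as a small function. The coefficient values below are placeholders chosen only to make the sketch runnable; they are NOT the fitted coefficients, which must be taken from the regression output above.

```python
def predict_charge(intercept, b_income, b_household, income, household_size):
    """Evaluate: amount charged = intercept + b_income * income + b_household * household_size."""
    return intercept + b_income * income + b_household * household_size

# PLACEHOLDER coefficients for illustration only; replace with the values
# from the multiple regression output before using the result.
PLACEHOLDER_INTERCEPT = 1.0
PLACEHOLDER_B_INCOME = 2.0
PLACEHOLDER_B_HOUSEHOLD = 3.0

prediction = predict_charge(
    PLACEHOLDER_INTERCEPT,
    PLACEHOLDER_B_INCOME,
    PLACEHOLDER_B_HOUSEHOLD,
    income=40,          # input value stated in the question
    household_size=3,   # input value stated in the question
)
print(prediction)
```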
5.
The multiple regression model is clearly an improvement over the simple linear models in terms of R², but further improvement is possible, since in theory the value can reach 1. Thus, more predictors are needed that would better explain the movements in credit card charges.
Some useful independent variables could include customers' spending levels, their demographic attributes, and the terms and conditions set by the credit card issuing company.
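One caution when adding predictors is that plain R² never decreases when a variable is added, even an irrelevant one, so adjusted R² is the usual basis for comparing models of different size. The sketch below applies the standard adjusted R² formula to the R² values reported above, with n = 50 taken from the stated sample size; the function name is an assumption for illustration.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R² = 1 - (1 - R²) * (n - 1) / (n - k - 1)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

n = 50  # sample size stated in the question

# R² values reported above for the three models.
models = {
    "income only": (0.3979, 1),
    "household size only": (0.5668, 1),
    "income + household size": (0.8254, 2),
}

adjusted = {name: adjusted_r2(r2, n, k) for name, (r2, k) in models.items()}
for name, value in adjusted.items():
    print(f"{name}: adjusted R² = {value:.4f}")
```

A new predictor is worth keeping only if it raises adjusted R², not merely R².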