Background
Briefly explain (using no more than 250 words in total for this Question 1)
(a) What type of survey method the researcher could use and why
(b) What sampling method could the researcher use to select his/her sample and why
(c) What are the two main variables the researcher should consider collecting data for the purpose of the above analysis and why? Identify the data type(s) for the variables.
(d) What kind of issues the researcher may face in this data collection
Suppose a researcher has collected data from a sample of 150 individuals using the sampling method you have proposed in (b). For each individual, the weekly take-home pay and weekly food expenditure were recorded. The data are stored in the file FOODEXP.XLSX, which is available in the “Assessment>>Assessment 2 - Assignment - Computer Application Project” section of the unit website. Using this data set and EXCEL, answer the questions below.
QUESTION 2
First, the researcher is interested in presenting the data on weekly take-home pay and food expenditure graphically. For this purpose, the researcher categorised the data on take-home pay and food expenditure into 6 groups and calculated the frequency of each group.
Using the data in the above frequency distribution tables and using EXCEL, answer the following questions.
(a) Which graphical technique or chart should be used if the researcher is interested in comparing the number of individuals in each weekly take-home pay category? Explain the reason for the selection of this graphical chart. Construct the chart and describe what you can observe about the number of individuals belonging to each take-home pay category.
(b) Which graphical technique or chart should be used if the researcher is interested in describing the proportion of individuals in each food expenditure category? Explain the reason for the selection of this graphical chart. Construct the chart and describe what you can observe about the proportion of individuals belonging to each food expenditure category.
QUESTION 3
Second, the researcher wishes to use graphical descriptive methods to present a summary of the data.
(a) The number of observations (N) is 150 individuals. The researcher suggests using 8 class intervals to construct a histogram for each variable. Explain how the researcher would have decided on the number of class intervals (K) as 8.
(b) The researcher suggests using class intervals 100-225, 225-350, 350-475, …, 975-1100 for the weekly take-home pay variable and class intervals 0-50, 50-100, 100-150, …, 350-400 for the food expenditure variable. Explain how the researcher would have decided the width of the above class intervals (or class width).
(c) Draw a histogram for each variable using appropriate BIN values from part (b) and comment on the shape of the two distributions.
QUESTION 4
Third, the researcher wishes to use numerical descriptive measures to summarize the data.
(a) Prepare a numerical summary report for the two variables; weekly take-home pay and food expenditure by including summary measures such as mean, median, range, variance, standard deviation, smallest and largest values and the three quartiles, for each variable.
(b) Compute the correlation coefficient to measure the direction and strength of the linear relationship between the two variables. Interpret this value.
QUESTION 5
Finally, the researcher considers using regression analysis to establish a linear relationship between the two variables food expenditure and weekly take-home pay.
(a) What are the dependent variable and the independent variable for this analysis? Why?
(b) Use an appropriate plot to investigate the relationship between the two variables. On the same plot, fit a linear trend line including the equation and the coefficient of determination R².
(c) Estimate a simple linear regression model and present the estimated linear equation. Display the regression summary table and interpret the intercept and slope coefficient estimates of the linear model.
(d) Find and interpret the value of the coefficient of determination, R-squared (R²).
(a) The data could be collected with a self-administered questionnaire that asks selected participants for their demographic details, income and expenses. A questionnaire is a less laborious way to collect data from across the nation without travelling to each area, since respondents fill out the form themselves and return it by prepaid mail or electronically at no cost to them.
(b) The study could use simple random sampling without replacement to select the sample of respondents from the population. Simple random sampling is a probability sampling technique that gives every sampling unit an equal chance of being included in the sample. Its simplicity makes it one of the most popular sampling techniques and, if it is executed properly, it keeps sampling bias to a minimum. Because every member of the population has an equal chance of selection, the sample can be treated as representative of the overall population, which supports generalising the results.
(c) The two main variables are the amount each person has at their disposal to spend each week (weekly take-home pay) and the amount they spend on food each week (weekly food expenditure). Both variables are numerical data measured on an interval scale, and both are continuous.
(d) A key issue could be non-response to the questionnaire, owing to a lack of interest in the research on the part of the respondents. Another issue could be respondents failing to disclose their income and expenditure correctly, either intentionally or otherwise.
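The sampling step proposed in (b) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sampling frame of 5,000 household IDs, the function name `draw_simple_random_sample` and the seed are all hypothetical and do not come from the assignment.

```python
import random


def draw_simple_random_sample(frame, n, seed=None):
    """Return n distinct units chosen uniformly at random from the frame."""
    rng = random.Random(seed)
    # random.sample draws without replacement, so no unit is selected twice
    # and every unit in the frame has the same chance of selection.
    return rng.sample(frame, n)


# Hypothetical sampling frame: ID numbers for 5,000 households.
sampling_frame = list(range(1, 5001))

sample_ids = draw_simple_random_sample(sampling_frame, n=150, seed=42)
print(len(sample_ids), "units selected,", len(set(sample_ids)), "distinct")
```

Fixing the seed makes the draw reproducible, which is useful when the selection has to be documented or audited.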
Weekly take-home pay

| Weekly take-home pay category | Frequency |
| --- | --- |
| Take-home pay group 1 | 15 |
| Take-home pay group 2 | 25 |
| Take-home pay group 3 | 35 |
| Take-home pay group 4 | 26 |
| Take-home pay group 5 | 19 |
| Take-home pay group 6 | 30 |
Table 1
The number of people in each weekly take-home pay category can be represented using a frequency bar chart, which consists of bars whose lengths equal the frequencies of the six take-home pay categories. This is the simplest and most straightforward way of showing a lay reader which category has the highest number of people and which has the lowest, because these correspond to the tallest and shortest bars. The categories can thus be compared on the basis of the heights or lengths of the bars, depending on whether the chart is a vertical or horizontal bar chart.
Figure 1
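As a rough stand-in for the Excel chart, the comparison that the bar chart supports can be sketched as plain text. The frequencies are those of Table 1; the function name `text_bar_chart` and the `#` bars are illustrative choices, not part of the original report.

```python
# Frequencies copied from Table 1.
take_home_pay_counts = {
    "Take-home pay group 1": 15,
    "Take-home pay group 2": 25,
    "Take-home pay group 3": 35,
    "Take-home pay group 4": 26,
    "Take-home pay group 5": 19,
    "Take-home pay group 6": 30,
}


def text_bar_chart(counts):
    """Build one text line per category, with a bar as long as its frequency."""
    width = max(len(label) for label in counts)
    return [f"{label:<{width}} | {'#' * n} {n}" for label, n in counts.items()]


for line in text_bar_chart(take_home_pay_counts):
    print(line)

# The tallest and shortest bars identify the largest and smallest categories.
largest_group = max(take_home_pay_counts, key=take_home_pay_counts.get)
smallest_group = min(take_home_pay_counts, key=take_home_pay_counts.get)
print("Largest:", largest_group, "| Smallest:", smallest_group)
```

Read this way, group 3 has the most individuals (35) and group 1 the fewest (15).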
Weekly Food Expenditure

| Weekly Food Expenditure category | Frequency |
| --- | --- |
| Food expenditure group 1 | 18 |
| Food expenditure group 2 | 28 |
| Food expenditure group 3 | 31 |
| Food expenditure group 4 | 29 |
| Food expenditure group 5 | 24 |
| Food expenditure group 6 | 20 |
Table 2
The proportion of individuals in each weekly food expenditure category can be represented using a pie chart. A pie chart is typically used to show the proportions or percentages that the parts make up of a whole. Each sector of the circle represents the percentage of the total number of people that falls in one category. It is easily interpreted by a lay audience.
The following figure shows the proportion of people in the weekly food expense categories.
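The proportions that such a pie chart displays can be worked out directly from Table 2. The sketch below is illustrative only: it converts each frequency into a percentage of the 150 respondents and into the angle of the corresponding sector; the variable names are assumptions.

```python
# Frequencies copied from Table 2.
food_expenditure_counts = {
    "Food expenditure group 1": 18,
    "Food expenditure group 2": 28,
    "Food expenditure group 3": 31,
    "Food expenditure group 4": 29,
    "Food expenditure group 5": 24,
    "Food expenditure group 6": 20,
}

total = sum(food_expenditure_counts.values())

# Share of the whole sample in each category, as a percentage.
percentages = {
    group: round(100 * count / total, 2)
    for group, count in food_expenditure_counts.items()
}

# Each sector's angle is the same share applied to the full 360 degrees.
sector_angles = {
    group: round(360 * count / total, 1)
    for group, count in food_expenditure_counts.items()
}

for group in food_expenditure_counts:
    print(f"{group}: {percentages[group]}% ({sector_angles[group]} degrees)")
```

On these figures the shares range from 12% (group 1) to about 20.67% (group 3), so no single category dominates the chart.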
(a) The 2^k rule states that the appropriate number of class intervals for grouping a set of N ungrouped observations is the smallest k for which 2^k is greater than N. With k = 7, 2^k is 128, and with k = 8, 2^k is 256. Since N = 150 is greater than 128 but less than 256, the number of class intervals to use is 8.
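The rule can be checked with a short loop. This is a minimal sketch; the function name `classes_by_2k_rule` is illustrative.

```python
def classes_by_2k_rule(n_observations):
    """Return the smallest k such that 2**k is greater than n_observations."""
    k = 1
    while 2 ** k <= n_observations:
        k += 1
    return k


print(classes_by_2k_rule(150))  # 2**7 = 128 is too small, 2**8 = 256 is enough
```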
(b) The width of each class interval is the difference between the upper limit of the last class interval and the lower limit of the first class interval, divided by the number of class intervals. For weekly take-home pay the difference is 1100 minus 100, which is 1000, so the class width is 1000/8, which is 125.
Similarly, for weekly food expenditure the upper limit of the last class interval is 400 and the lower limit of the first class interval is 0, so the difference is 400. Since the number of class intervals is 8, the class width is 400/8, which is 50.
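The class-width arithmetic, and the bin values it implies for the histograms, can be sketched as follows. The function name `class_edges` is illustrative, and treating the upper class limits as the BIN values for Excel is an assumption about how the histogram tool was set up.

```python
def class_edges(lower, upper, k):
    """Return the class width and the k + 1 class boundaries."""
    width = (upper - lower) / k
    edges = [lower + i * width for i in range(k + 1)]
    return width, edges


pay_width, pay_edges = class_edges(100, 1100, 8)
food_width, food_edges = class_edges(0, 400, 8)

# Assumed convention: the upper limit of each class is used as its bin value.
pay_bins = pay_edges[1:]
food_bins = food_edges[1:]

print(pay_width, pay_bins)
print(food_width, food_bins)
```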
(c) The frequency table and the histogram for the weekly take home pay as per the class intervals described in part (b) are given as follows:
Frequency Table: Weekly Take Home Pay

| Take Home Pay | Frequency | Cumulative % |
| --- | --- | --- |
| 100-225 | 15 | 10.00% |
| 225-350 | 25 | 26.67% |
| 350-475 | 35 | 50.00% |
| 475-600 | 26 | 67.33% |
| 600-725 | 19 | 80.00% |
| 725-850 | 16 | 90.67% |
| 850-975 | 9 | 96.67% |
| 975-1100 | 5 | 100.00% |
Figure 3
The frequency table and the histogram for the weekly food expenditure as per the class intervals described in part (b) are given as follows:
Frequency Table: Weekly Food Expenditure

| Food Expenditure | Frequency | Cumulative % |
| --- | --- | --- |
| 0-50 | 3 | 2.00% |
| 50-100 | 16 | 12.67% |
| 100-150 | 29 | 32.00% |
| 150-200 | 30 | 52.00% |
| 200-250 | 27 | 70.00% |
| 250-300 | 25 | 86.67% |
| 300-350 | 15 | 96.67% |
| 350-400 | 5 | 100.00% |
Table 4
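The cumulative percentages in the two frequency tables follow directly from the class frequencies. The sketch below reproduces them from the frequencies reported above; the function name `cumulative_percentages` is illustrative.

```python
from itertools import accumulate


def cumulative_percentages(frequencies):
    """Running total of the frequencies as a percentage of the grand total."""
    total = sum(frequencies)
    return [round(100 * running / total, 2) for running in accumulate(frequencies)]


# Class frequencies copied from the two frequency tables.
pay_frequencies = [15, 25, 35, 26, 19, 16, 9, 5]
food_frequencies = [3, 16, 29, 30, 27, 25, 15, 5]

pay_cumulative = cumulative_percentages(pay_frequencies)
food_cumulative = cumulative_percentages(food_frequencies)

print(pay_cumulative)
print(food_cumulative)
```

Both lists end at 100%, and half of the respondents fall below $475 of take-home pay, which is in line with the median of $465 reported in the numerical summary.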
The average weekly take-home pay was found to be $501.59 for a typical participant. The variance was 56481.72 and the standard deviation was $237.66, which indicates the extent of the spread of the data around the mean. The median shows that fifty percent of the people earned at least $465 per week. The first quartile shows that seventy-five percent of the people earned at least $313 per week, and the third quartile shows that twenty-five percent earned at least $680.50 per week. The minimum weekly take-home pay was $105 and the maximum was $1090.
Numerical Summary: Weekly Take-home pay

| Statistic | Value |
| --- | --- |
| Mean | 501.5867 |
| Median (Q2) | 465 |
| Minimum | 105 |
| Maximum | 1090 |
| Q1 | 313 |
| Q3 | 680.5 |
| Range | 985 |
| Variance | 56481.72 |
| Standard Deviation | 237.6588 |
Table 5
The average weekly expenditure on food was found to be $197.99 for a typical participant. The variance was 6848.11 and the standard deviation was $82.75, which indicates the extent of the spread of the data around the mean. The median shows that fifty percent of the people spent at least $190.87 per week on food. The first quartile shows that seventy-five percent of the people spent at least $128.74 per week on food, and the third quartile shows that twenty-five percent spent at least $261.01 per week. The minimum weekly expenditure on food was $44.31 and the maximum was about $373.
Numerical Summary: Weekly food expenditure

| Statistic | Value |
| --- | --- |
| Mean | 197.9913 |
| Median (Q2) | 190.8705 |
| Minimum | 44.3137 |
| Maximum | 373 |
| Q1 | 128.7365 |
| Q3 | 261.0111 |
| Range | 329.1642 |
| Variance | 6848.111 |
| Standard Deviation | 82.75331 |
Table 6
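A summary report of this kind can be sketched with Python's standard `statistics` module. The raw data in FOODEXP.XLSX are not reproduced here, so the nine values below are hypothetical and serve only to show the calculation. Two further assumptions are made: the variance is taken to be the sample variance (divisor n - 1), and the quartiles use the `inclusive` method, which is assumed to mirror Excel's QUARTILE.INC.

```python
import statistics


def numerical_summary(values):
    """Return the summary measures listed in Question 4(a) for one variable."""
    # method="inclusive" interpolates between the ordered observations.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "median": q2,
        "minimum": min(values),
        "maximum": max(values),
        "q1": q1,
        "q3": q3,
        "range": max(values) - min(values),
        "variance": statistics.variance(values),  # sample variance, n - 1
        "std_dev": statistics.stdev(values),      # square root of the above
    }


# Hypothetical weekly take-home pay values, NOT the assignment data.
hypothetical_pay = [105, 250, 313, 420, 465, 540, 680, 810, 1090]

summary = numerical_summary(hypothetical_pay)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Applying the same function to each column of the real data set would give one report per variable, laid out like Tables 5 and 6.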
The linear relationship between weekly take-home pay and weekly food expenditure is quantified by the correlation coefficient, which was found to be 0.8997 (about 0.90). The positive sign shows that food expenditure tends to rise as take-home pay rises, and a value this close to 1 indicates that the linear relationship between the two variables is strong.
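For reference, the correlation coefficient can be computed from paired observations as sketched below. The five pairs of values are hypothetical, not the assignment data, and the function name `pearson_r` is illustrative.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired observations, NOT the assignment data.
hypothetical_pay = [200, 400, 600, 800, 1000]
hypothetical_food = [110, 160, 230, 290, 360]

r = pearson_r(hypothetical_pay, hypothetical_food)
print(round(r, 4))
```

The sign of the result gives the direction of the relationship and its closeness to 1 or -1 gives the strength, which is how the reported value is read above.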
Q5.
(a) The dependent (response) variable is a person's weekly food expenditure and the independent variable is the weekly take-home pay. The analysis seeks to establish how weekly food expenditure relates to weekly take-home pay. Expenditure naturally depends on earnings, that is, a person spends according to how much he or she earns in a week, so the amount spent on food should be influenced by the amount earned.
(b) The following scatter diagram depicts the relationship between a person's weekly food expenditure and weekly take-home pay, that is, the earnings received in hand each week. Food expenditure is plotted against take-home pay, and the points show an overall linear relationship: expenditure tends to increase as take-home pay increases. The chart shows the scatter plot, the trend line, the trend line equation and R².
(c) The trend line in the chart has a positive slope. The equation of the trend line shown on the chart is y = 0.3133x + 40.859, where y is weekly food expenditure and x is weekly take-home pay. The value of R², the coefficient of determination, was found to be 0.8094, which is consistent with a strong positive linear relationship between the response and the covariate. The following tables give the regression output for the line that was fitted to the data.
The overall significance measure (Significance F) given in the table below shows that the fitted model explains a significant share of the variation in the dependent variable, weekly food expenditure, at the 0.05 level of significance.
Regression: ANOVA

| | df | SS | MS | F | Significance F |
| --- | --- | --- | --- | --- | --- |
| Regression | 1 | 825915.5 | 825915.5 | 628.612 | 3.85E-55 |
| Residual | 148 | 194453 | 1313.872 | | |
| Total | 149 | 1020368 | | | |
Table 7
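The entries of the ANOVA table are linked by simple arithmetic, which the following sketch checks using only the sums of squares and degrees of freedom reported in Table 7. Because the table values are rounded, the results agree with the reported figures only to within rounding.

```python
# Values copied from Table 7.
ss_regression = 825915.5
ss_residual = 194453
df_regression = 1
df_residual = 148

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# The F statistic compares explained with unexplained variation.
f_statistic = ms_regression / ms_residual

# R-squared is the explained share of the total sum of squares.
ss_total = ss_regression + ss_residual
r_squared = ss_regression / ss_total

print(round(ms_residual, 3), round(f_statistic, 3), round(r_squared, 4))
```

The recomputed R-squared of about 0.8094 matches the value shown on the trend line.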
The following table gives the parameter estimates of the fitted regression model. The covariate, weekly take-home pay, has a significant positive effect on the dependent variable, food expenditure, at the 5% level of significance. Weekly food expenditure increases by about $0.313 on average for each additional dollar of weekly take-home pay. The 95% confidence interval for this slope is 0.289 to 0.338, which means we can be 95% confident that the interval contains the true effect of weekly take-home pay on weekly food expenditure in the population under study. The estimate of the intercept is 40.859. This is the predicted weekly food expenditure of a person whose weekly take-home pay is zero. Since the lowest take-home pay in the sample was $105, this figure is an extrapolation and should be interpreted with caution.
Regression Parameter Estimates

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
| --- | --- | --- | --- | --- | --- | --- |
| Intercept | 40.8585244 | 6.930892 | 5.895132 | 2.43E-08 | 27.16223 | 54.55482 |
| Take-home pay | 0.313271369 | 0.012495 | 25.07214 | 3.85E-55 | 0.28858 | 0.337963 |
Table 8
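As a consistency check, the slope and intercept of a simple linear regression can be recovered from the summary statistics reported earlier, since the slope equals r times the ratio of the standard deviations and the fitted line passes through the two means. The sketch below uses only figures from the tables in this report; the function name `predict_food_expenditure` and the example pay of $500 are illustrative.

```python
# Values copied from the numerical summaries and the regression output.
r = 0.89968252            # Multiple R (the correlation, as the slope is positive)
mean_pay, sd_pay = 501.5867, 237.6588
mean_food, sd_food = 197.9913, 82.75331

# Slope and intercept implied by the summary statistics.
slope = r * sd_food / sd_pay
intercept = mean_food - slope * mean_pay
print(round(slope, 4), round(intercept, 2))


def predict_food_expenditure(pay, b0=40.8585244, b1=0.313271369):
    """Predicted weekly food expenditure from the fitted line in Table 8."""
    return b0 + b1 * pay


# Illustrative prediction for a weekly take-home pay of $500.
print(round(predict_food_expenditure(500), 2))
```

Predictions of this kind are most trustworthy for take-home pay inside the observed range of $105 to $1090.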
(d) The coefficient of determination, R², was found to be 0.809. It is the ratio of the variation explained by the regression model to the total variation, and so measures the goodness of fit of the model. Here the model captures about 80.94% of the variation in the response variable, weekly food expenditure, which is a large share of the total variation.
Regression Statistics

| Statistic | Value |
| --- | --- |
| Multiple R | 0.89968252 |
| R Square | 0.809428637 |
| Adjusted R Square | 0.808140992 |
| Standard Error | 36.24736811 |
| Observations | 150 |
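The remaining regression statistics can be derived from R Square, the number of observations and the residual mean square in Table 7. The sketch below assumes the usual formulas for a model with one predictor; small differences from the reported values are due to rounding in the inputs.

```python
import math

# Values copied from the regression output.
r_squared = 0.809428637
n_observations = 150
n_predictors = 1
ms_residual = 1313.872  # residual mean square from Table 7

# Multiple R is the square root of R Square.
multiple_r = math.sqrt(r_squared)

# Adjusted R Square penalises R Square for the number of predictors.
adjusted_r_squared = 1 - (1 - r_squared) * (n_observations - 1) / (
    n_observations - n_predictors - 1
)

# The standard error of the regression is the square root of the residual MS.
standard_error = math.sqrt(ms_residual)

print(round(multiple_r, 4), round(adjusted_r_squared, 4), round(standard_error, 3))
```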