Background
Briefly explain (using no more than 250 words in total for this Question 1)
(a) What type of survey method the researcher could use and why
(b) What sampling method could the researcher use to select his/her sample and why
(c) What are the two main variables the researcher should consider collecting data for the purpose of the above analysis and why? Identify the data type(s) for the variables.
(d) What kind of issues the researcher may face in this data collection
Suppose a researcher has collected data from a sample of 150 individuals using the sampling method you have proposed in (b). For each individual, the weekly take-home pay and weekly food expenditure were recorded. The data are stored in the file FOODEXP.XLSX, which is available under “Assessment >> Assessment 2 Assignment Computer Application Project” on the unit website. Using this data set and Excel, answer the questions below.
QUESTION 2
First, the researcher is interested in presenting the data on weekly take-home pay and food expenditure graphically. For this purpose, the researcher categorised the data on take-home pay and food expenditure into 6 groups and calculated the frequency of each group.
Using the data in the above frequency distribution tables and using Excel, answer the following questions.
(a) Which graphical technique or chart should be used if the researcher is interested in comparing the number of individuals in each weekly take-home pay category? Explain the reason for the selection of this graphical chart. Construct the chart and describe what you can observe about the number of individuals belonging to each take-home pay category.
(b) Which graphical technique or chart should be used if the researcher is interested in describing the proportion of the individuals in each food expenditure category? Explain the reason for the selection of this graphical chart. Construct the chart and describe what you can observe about the proportion of individuals belonging to each food expenditure category.
QUESTION 3
Second, the researcher wishes to use graphical descriptive methods to present a summary of the data.
(a) The number of observations (N) is 150 individuals. The researcher suggests using 8 class intervals to construct a histogram for each variable. Explain how the researcher would have decided on the number of class intervals (K) as 8.
(b) The researcher suggests using class intervals 100-225, 225-350, 350-475, ..., 975-1100 for the weekly take-home pay variable and class intervals 0-50, 50-100, 100-150, ..., 350-400 for the food expenditure variable. Explain how the researcher would have decided the width of the above class intervals (or class width).
(c) Draw a histogram for each variable using appropriate BIN values from part (b) and comment on the shape of the two distributions.
QUESTION 4
Third, the researcher wishes to use numerical descriptive measures to summarize the data.
(a) Prepare a numerical summary report for the two variables, weekly take-home pay and food expenditure, by including summary measures such as mean, median, range, variance, standard deviation, smallest and largest values and the three quartiles, for each variable.
(b) Compute the correlation coefficient to measure the direction and strength of the linear relationship between the two variables. Interpret this value.
QUESTION 5
Finally, the researcher considers using regression analysis to establish a linear relationship between the two variables food expenditure and weekly take-home pay.
(a) What is the dependent variable and independent variable for this analysis? Why?
(b) Use an appropriate plot to investigate the relationship between the two variables. On the same plot, fit a linear trend line including the equation and the coefficient of determination R^{2}.
(c) Estimate a simple linear regression model and present the estimated linear equation. Display the regression summary table and interpret the intercept and slope coefficient estimates of the linear model.
(d) Find and interpret the value of the coefficient of determination, R-squared (R^{2}).
(a) The data could be collected with a self-administered questionnaire that asks selected participants for their demographic details, income and expenses. A questionnaire is a less laborious way to collect data from across the nation because the researcher does not need to travel: respondents fill out the form themselves and return it by prepaid mail or electronically at no cost to them.
(b) The study could use simple random sampling without replacement to select the sample. Simple random sampling is a probability sampling technique that gives every sampling unit an equal chance of being included in the sample. Its simplicity makes it one of the most popular sampling techniques, and when executed properly it keeps sampling bias to a minimum. Because every member of the population has the same chance of selection, the sample can be expected to represent the overall population, which supports generalising the results.
(c) The two main variables are the amount each individual has at their disposal to spend each week (weekly take-home pay) and the amount they spend on food each week (weekly food expenditure). Both variables are measured on an interval scale and are continuous.
(d) A key issue could be nonresponse to the questionnaire owing to a lack of interest in the research on the part of the respondents. Another issue could be respondents failing to disclose their income and expenditure correctly, whether intentionally or otherwise.
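The sampling step proposed in (b) can be sketched in a few lines of Python. This is only an illustration: the sampling frame of 10,000 numbered individuals, the function name and the seed are assumptions, not part of the study.

```python
import random


def simple_random_sample(population_ids, sample_size, seed=None):
    """Draw a simple random sample without replacement.

    Every unit in population_ids has the same chance of selection.
    """
    rng = random.Random(seed)  # seeded only so the sketch is reproducible
    return rng.sample(list(population_ids), sample_size)


# Hypothetical sampling frame: individuals numbered 1 to 10,000 (an assumption).
frame = range(1, 10_001)
sample = simple_random_sample(frame, 150, seed=1)

# 150 selected units, none selected twice.
print(len(sample), len(set(sample)))
```

Because `random.sample` never returns the same element twice, the draw is without replacement, as the answer proposes.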
Weekly take-home pay

Weekly take-home pay category | Frequency
Take-home pay group 1 | 15
Take-home pay group 2 | 25
Take-home pay group 3 | 35
Take-home pay group 4 | 26
Take-home pay group 5 | 19
Take-home pay group 6 | 30
Table 1
The number of people in each weekly take-home pay category can be represented using a frequency bar chart, which consists of bars whose lengths equal the frequencies of the six take-home pay categories. This is the simplest and most straightforward way of showing a layperson which category has the highest number of people and which has the lowest. The categories are compared on the basis of the heights or lengths of the bars, depending on whether the chart is a vertical or a horizontal bar chart.
Figure 1
Weekly Food Expenditure

Weekly Food Expenditure category | Frequency
Food expenditure group 1 | 18
Food expenditure group 2 | 28
Food expenditure group 3 | 31
Food expenditure group 4 | 29
Food expenditure group 5 | 24
Food expenditure group 6 | 20
Table 2
The proportion of individuals in each weekly food expenditure category can be represented using a pie chart. A pie chart is typically used to show graphically the proportions or percentages that the parts make up of a whole. Each sector of the circle represents the percentage of the total number of people that falls in one category. It is easily interpreted by a layperson to whom the situation needs to be communicated.
The following figure shows the proportion of people in the weekly food expense categories.
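The shares that a pie chart displays can be worked out directly from the frequencies in Table 2. The sketch below is one way to do that arithmetic; the variable names are illustrative.

```python
# Frequencies copied from Table 2.
food_frequencies = {
    "Food expenditure group 1": 18,
    "Food expenditure group 2": 28,
    "Food expenditure group 3": 31,
    "Food expenditure group 4": 29,
    "Food expenditure group 5": 24,
    "Food expenditure group 6": 20,
}

total = sum(food_frequencies.values())

# Each sector of the pie is the category frequency divided by the total.
proportions = {group: freq / total for group, freq in food_frequencies.items()}

for group, share in proportions.items():
    print(f"{group}: {share:.2%}")
```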
(a) The 2^{k} rule states that the appropriate number of class intervals for grouping N ungrouped observations is the smallest k for which 2^{k} is greater than N. For k = 7, 2^{k} = 128, and for k = 8, 2^{k} = 256. Since N = 150 is greater than 128 but less than 256, the number of class intervals to use is 8.
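The rule can be expressed as a short search for the smallest k whose power of two exceeds the number of observations. The function name below is illustrative.

```python
def classes_by_2k_rule(n_observations):
    """Return the smallest k such that 2**k is greater than n_observations."""
    k = 1
    while 2 ** k <= n_observations:
        k += 1
    return k


# For the 150 individuals in the sample: 2**7 = 128 is too small, 2**8 = 256 is enough.
print(classes_by_2k_rule(150))
```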
(b) The width of each class interval is the difference between the upper limit of the last class interval and the lower limit of the first class interval, divided by the number of class intervals. For weekly take-home pay the difference is 1100 minus 100, which is 1000, so the class width is 1000/8 = 125.
Similarly, for weekly food expenditure the upper limit of the last class interval was found to be 400 and the lower limit of the first class interval was 0. The difference is therefore 400. Since the number of class intervals is 8, the class width is 400/8, which is 50.
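The class-width arithmetic, and the class boundaries that follow from it, can be sketched as below. The function names are illustrative. Excel's histogram tool is generally given the upper class limits as BIN values, so the last eight entries of each list would serve as the bins under that assumption.

```python
def class_width(lower_limit, upper_limit, n_classes):
    """Width = (upper limit of last class - lower limit of first class) / k."""
    return (upper_limit - lower_limit) / n_classes


def class_boundaries(lower_limit, upper_limit, n_classes):
    """Return the k + 1 boundaries that delimit k equal-width classes."""
    width = class_width(lower_limit, upper_limit, n_classes)
    return [lower_limit + i * width for i in range(n_classes + 1)]


pay_boundaries = class_boundaries(100, 1100, 8)   # width 125
food_boundaries = class_boundaries(0, 400, 8)     # width 50

print(pay_boundaries)
print(food_boundaries)
```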
(c) The frequency table and the histogram for the weekly take home pay as per the class intervals described in part (b) are given as follows:
Frequency Table: Weekly Take Home Pay

Take Home Pay | Frequency | Cumulative %
100-225 | 15 | 10.00%
225-350 | 25 | 26.67%
350-475 | 35 | 50.00%
475-600 | 26 | 67.33%
600-725 | 19 | 80.00%
725-850 | 16 | 90.67%
850-975 | 9 | 96.67%
975-1100 | 5 | 100.00%

Figure 3
The frequency table and the histogram for the weekly food expenditure as per the class intervals described in part (b) are given as follows:
Frequency Table: Weekly Food Expenditure

Food Expenditure | Frequency | Cumulative %
0-50 | 3 | 2.00%
50-100 | 16 | 12.67%
100-150 | 29 | 32.00%
150-200 | 30 | 52.00%
200-250 | 27 | 70.00%
250-300 | 25 | 86.67%
300-350 | 15 | 96.67%
350-400 | 5 | 100.00%

Table 4
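The cumulative percentage column in the two frequency tables is the running total of the frequencies divided by the sample size. A minimal sketch of that calculation, using the frequencies from the tables, is:

```python
from itertools import accumulate

# Class frequencies copied from the two frequency tables.
pay_frequencies = [15, 25, 35, 26, 19, 16, 9, 5]
food_frequencies_by_class = [3, 16, 29, 30, 27, 25, 15, 5]


def cumulative_percent(frequencies):
    """Running total of the frequencies as a percentage of the overall total."""
    total = sum(frequencies)
    return [round(100 * running / total, 2) for running in accumulate(frequencies)]


print(cumulative_percent(pay_frequencies))
print(cumulative_percent(food_frequencies_by_class))
```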
The average weekly take-home pay was $501.59 for a typical participant. The variance was 56481.72 and the standard deviation was 237.66, which measure the spread of the data around the mean. The median shows that fifty percent of the people earned at least $465 per week. The first quartile shows that seventy-five percent of the people earned at least $313 per week, and the third quartile shows that twenty-five percent earned at least $680.50 per week. The minimum weekly take-home pay was $105 and the maximum was $1090.
Numerical Summary: Weekly Take-home pay

Mean | 501.5867
Median (Q2) | 465
Minimum | 105
Maximum | 1090
Q1 | 313
Q3 | 680.5
Range | 985
Variance | 56481.72
Standard Deviation | 237.6588

Table 5
The average weekly expenditure on food was $197.99 for a typical participant. The variance was 6848.11 and the standard deviation was 82.75, which measure the spread of the data around the mean. The median shows that fifty percent of the people spent at least $190.87 per week on food. The first quartile shows that seventy-five percent of the people spent at least $128.74 per week on food, and the third quartile shows that twenty-five percent spent at least $261.01 per week. The minimum weekly expenditure on food was $44.31 and the maximum was $373.
Numerical Summary: Weekly food expenditure

Mean | 197.9913
Median (Q2) | 190.8705
Minimum | 44.3137
Maximum | 373
Q1 | 128.7365
Q3 | 261.0111
Range | 329.1642
Variance | 6848.111
Standard Deviation | 82.75331

Table 6
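The measures in the two summary tables can be reproduced outside Excel with Python's standard library. The sketch below uses a small made-up data set, since the raw FOODEXP.XLSX values are not reproduced here. It assumes sample (n - 1) variance and the "inclusive" quartile method, which interpolates in the same way as Excel's QUARTILE function is understood to; both choices are assumptions about how the original figures were produced.

```python
import statistics


def numerical_summary(values):
    """Return the summary measures listed in the report for one variable."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "median": q2,
        "minimum": min(values),
        "maximum": max(values),
        "q1": q1,
        "q3": q3,
        "range": max(values) - min(values),
        "variance": statistics.variance(values),   # sample variance (n - 1)
        "std_dev": statistics.stdev(values),       # square root of the variance
    }


# Hypothetical values for illustration only, not the study data.
demo_values = [2, 4, 4, 5, 7, 9, 11]
summary = numerical_summary(demo_values)

for measure, value in summary.items():
    print(f"{measure}: {value}")
```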
The linear relationship between weekly take-home pay and weekly food expenditure is quantified using the correlation coefficient, which was found to be 0.8997, or about 0.90. Because the value is positive and close to 1, the two variables have a strong, positive linear relationship.
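For reference, the Pearson correlation coefficient divides the co-variation of the two variables by the square root of the product of their individual variations. The sketch below implements that formula on made-up numbers; the data and names are hypothetical and are not the study values.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical take-home pay and food expenditure values for illustration.
pay_demo = [200, 400, 600, 800]
food_demo = [100, 170, 230, 300]
r_demo = pearson_r(pay_demo, food_demo)

print(round(r_demo, 4))
```

A value near +1 indicates a strong positive linear relationship, a value near -1 a strong negative one, and a value near 0 little or no linear relationship.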
Q5.
(a) The dependent (response) variable is the weekly food expenditure of a person and the independent variable is the weekly take-home pay. The scenario seeks to establish a relationship between weekly food expenditure and weekly take-home pay. Expenditure naturally depends on earnings, that is, a person spends according to how much he or she earns in a week. The expenditure on food should therefore also be influenced by the amount earned.
(b) A scatter diagram depicts the relationship between a person's weekly expenditure on food and his or her weekly take-home pay, that is, the earnings received in hand per week. The plot shows food expenditure against take-home pay, and the points follow a linear pattern overall: expenditure tends to increase as take-home pay increases. The following chart shows the scatter plot, trend line, trend line equation and R^{2}.
(c) The trend line in the chart has a positive slope. The equation of the trend line shown on the chart is y = 0.3133x + 40.859. The value of R^{2}, the coefficient of determination, was found to be 0.8094, which reflects a strong linear relationship between the response and the covariate. The following tables give the regression statistics for the linear regression line that was fitted to the data.
The overall regression significance measure (Significance F) in the table below shows that the fitted model significantly explains the variation in the dependent variable, weekly food expenditure, at the 0.05 level of significance.
Regression: ANOVA

 | df | SS | MS | F | Significance F
Regression | 1 | 825915.5 | 825915.5 | 628.612 | 3.85E-55
Residual | 148 | 194453 | 1313.872 | |
Total | 149 | 1020368 | | |
Table 7
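The entries of Table 7 are linked by simple arithmetic: each mean square is a sum of squares divided by its degrees of freedom, F is the ratio of the two mean squares, and R^{2} is the regression sum of squares over the total. The following sketch recomputes those quantities from the rounded values in the table, so small rounding differences are to be expected.

```python
# Sums of squares and degrees of freedom copied from Table 7.
ss_regression = 825915.5
ss_residual = 194453.0
df_regression = 1
df_residual = 148

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F statistic: explained mean square relative to unexplained mean square.
f_statistic = ms_regression / ms_residual

# Coefficient of determination: share of total variation that is explained.
ss_total = ss_regression + ss_residual
r_squared = ss_regression / ss_total

print(round(ms_residual, 3), round(f_statistic, 3), round(r_squared, 4))
```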
The following table gives the parameter estimates of the fitted regression model. The covariate, weekly take-home pay, has a significant positive effect on the dependent variable, food expenditure, at the 5% level of significance. Weekly food expenditure increases by about 0.313 units for each one-unit increase in weekly take-home pay. The 95% confidence interval for this slope runs from 0.288 to 0.338, which means we can be 95% confident that the interval contains the true effect of weekly take-home pay on weekly food expenditure in the population under study. The estimate of the intercept was found to be 40.859. This means that if the weekly take-home pay were zero, this is the amount that a person would be predicted to spend on food per week.
Regression Parameter Estimates

 | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95%
Intercept | 40.8585244 | 6.930892 | 5.895132 | 2.43E-08 | 27.16223 | 54.55482
Take-home pay | 0.313271369 | 0.012495 | 25.07214 | 3.85E-55 | 0.28858 | 0.337963
Table 8
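The estimates in Table 8 define the fitted line, which can be used to predict food expenditure for a given take-home pay. The sketch below is illustrative; the function and constant names are not from the original analysis. A fitted least-squares line passes through the point of means, so plugging in the mean take-home pay (501.5867) should return roughly the mean food expenditure (197.9913), which is a useful sanity check.

```python
# Coefficient estimates copied from Table 8.
INTERCEPT = 40.8585244
SLOPE = 0.313271369


def predict_food_expenditure(take_home_pay):
    """Predicted weekly food expenditure from the fitted linear equation."""
    return INTERCEPT + SLOPE * take_home_pay


# Prediction at the sample mean of take-home pay.
print(round(predict_food_expenditure(501.5867), 2))

# A 100-unit rise in take-home pay changes the prediction by 100 * SLOPE.
print(round(predict_food_expenditure(600) - predict_food_expenditure(500), 2))
```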
(d) The coefficient of determination, or R^{2} statistic, was found to be 0.809. This statistic is the ratio of the variation explained by the regression model to the total variation. It is a measure of the goodness of fit of the model and indicates how much of the variation in the response variable the model captures, which is about 80.94% in this case. The model thus captures a large portion of the variation.
Regression Statistics

Multiple R | 0.89968252
R Square | 0.809428637
Adjusted R Square | 0.808140992
Standard Error | 36.24736811
Observations | 150
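The figures in the regression statistics table are related to one another. In a simple linear regression with a positive slope, Multiple R is the square root of R Square, and Adjusted R Square rescales the unexplained share by the degrees of freedom. The sketch below recomputes both from the reported R Square and sample size, as a check under those standard formulas.

```python
import math

# Values copied from the regression statistics table.
r_squared = 0.809428637
n_observations = 150
n_predictors = 1  # take-home pay is the only explanatory variable

# Multiple R: the correlation between observed and fitted values.
multiple_r = math.sqrt(r_squared)

# Adjusted R Square penalises the unexplained share for the number of predictors.
adjusted_r_squared = 1 - (1 - r_squared) * (n_observations - 1) / (
    n_observations - n_predictors - 1
)

print(round(multiple_r, 6), round(adjusted_r_squared, 6))
```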