a. What is the standard error of estimate? What does this statistic tell you?
b. What is the coefficient of determination? What does this statistic tell you?
c. What is the coefficient of determination adjusted for degrees of freedom? What do this statistic and the one referred to in part (b) tell you about how well the model fits the data?
d. Test the overall utility of the model. What does the test result tell you?
e. Interpret each of the coefficients.
f. Do these data allow the statistics practitioner to infer that the heights of the sons and the fathers are linearly related?
g. Do these data allow the statistics practitioner to infer that the heights of the sons and the mothers are linearly related?
Question (1a)
A questionnaire survey could be used for the sample of 100 students because it is an economical way to obtain quantitative data for analysis and prediction. It is preferred for this study because it reaches all 100 students easily within a short period of time and collects first-hand, effective answers directly from each individual.
Question (1b)
Simple random sampling. The research targets students as the respondents, and this population is readily available (Aladag, 2015). Simple random sampling gives every respondent an equal chance of being chosen and therefore helps to reduce selection bias. For these reasons, simple random sampling is the preferred method.
Question (1c)
The given data contain two variables: preparation time and marks. Preparation time is the independent variable and students’ marks is the dependent variable.
Students’ marks is the variable we are interested in, and it is believed to be affected by preparation time. Preparation time is the variable we believe affects the marks. For this reason, we conclude that students’ marks and preparation time are the dependent and independent variables respectively.
Both preparation time and students’ marks are quantitative data, since both are recorded numerically.
Question (1d)
The method requires the population to be grouped (Saeed, 2017), and doing so takes time.
Our sample size is 100, which may not be large enough to support firm conclusions; this presents a considerable challenge.
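| Class interval | Frequency | Relative frequency | Cumulative relative frequency |
|---|---|---|---|
| 20-29 | 1 | 0.01 | 0.01 |
| 30-39 | 8 | 0.08 | 0.09 |
| 40-49 | 16 | 0.16 | 0.25 |
| 50-59 | 20 | 0.20 | 0.45 |
| 60-69 | 20 | 0.20 | 0.65 |
| 70-79 | 17 | 0.17 | 0.82 |
| 80-89 | 12 | 0.12 | 0.94 |
| 90-100 | 6 | 0.06 | 1.00 |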
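The relative and cumulative relative frequency columns follow directly from the frequency column. The short Python sketch below reproduces them from the class frequencies for preparation time; the variable names are illustrative and not part of the original analysis.

```python
from itertools import accumulate

# Class intervals and frequencies copied from the preparation-time table.
intervals = ["20-29", "30-39", "40-49", "50-59",
             "60-69", "70-79", "80-89", "90-100"]
frequencies = [1, 8, 16, 20, 20, 17, 12, 6]

total = sum(frequencies)  # sample size

# Relative frequency: share of the sample falling in each class.
relative = [f / total for f in frequencies]

# Cumulative relative frequency: running count divided by the sample size.
# Accumulating the integer counts first avoids floating-point drift.
cumulative = [c / total for c in accumulate(frequencies)]

for label, f, rel, cum in zip(intervals, frequencies, relative, cumulative):
    print(f"{label:>7} {f:>4} {rel:>6.2f} {cum:>6.2f}")
```

Running the same calculation on any other frequency column is a quick way to check that the relative frequencies sum to 1 and that the last cumulative value equals 1.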
Frequency Histogram for Preparation Time
The frequency histogram for preparation time displays an approximately normal distribution (Khan, 2015), with most of the data clustered around the mean.
Relative Frequency Histogram for Preparation Time
The relative frequency histogram above shows that the data are slightly negatively skewed, that is, skewed to the left.
Cumulative Relative Frequency Histogram for Preparation Time
The cumulative relative frequency increases with each successive class interval.
DISTRIBUTION TABLE FOR MARKS
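| Class interval | Frequency | Relative frequency | Cumulative relative frequency |
|---|---|---|---|
| 20-29 | 1 | 0.01 | 0.01 |
| 30-39 | 5 | 0.05 | 0.06 |
| 40-49 | 10 | 0.10 | 0.16 |
| 50-59 | 17 | 0.17 | 0.33 |
| 60-69 | 20 | 0.20 | 0.54 |
| 70-79 | 22 | 0.22 | 0.76 |
| 80-89 | 14 | 0.14 | 0.90 |
| 90-100 | 10 | 0.10 | 1.00 |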
Frequency Histogram for the Marks
The frequency histogram for the marks is nonsymmetrical (Stoykov, 2013), with most of the data skewed to the left. This reveals that the data do not follow a normal distribution.
Relative Frequency Histogram for the Marks
The relative frequency histogram above shows that the students’ marks are negatively skewed, that is, skewed to the left.
Cumulative Relative Frequency Histogram for the Marks
The cumulative relative frequency increases with each successive class interval.
Question (1f)
Preparation time is identified as the cause of increases or decreases in students’ marks, so it is plotted on the x-axis. Students’ marks experience the effect, so they are plotted on the y-axis.
Question (1g)
Regression Statistics

| Statistic | Value |
|---|---|
| Multiple R | 0.546556431 |
| R Square | 0.298723932 |
| Adjusted R Square | 0.291568054 |
| Standard Error | 14.65409389 |
| Observations | 100 |
ANOVA

| Source | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 1 | 8964.478169 | 8964.478169 | 41.7452508 | 4.04091E-09 |
| Residual | 98 | 21044.76183 | 214.7424677 | | |
| Total | 99 | 30009.24 | | | |
| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
|---|---|---|---|---|---|---|
| Intercept | 28.98427749 | 5.874519858 | 4.933897271 | 3.2994E-06 | 17.32648402 | 40.64207096 |
| X Variable 1 | 0.583053974 | 0.090241275 | 6.461056477 | 4.04091E-09 | 0.403973101 | 0.762134847 |
Model Summary

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
|---|---|---|---|---|
| 1 | .547ᵃ | .299 | .292 | 14.65409 |

a. Predictors: (Constant), preparation time
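The summary statistics in the regression output can be recovered from the ANOVA sums of squares. The sketch below, with illustrative variable names, recomputes R Square, the standard error of the estimate, and the F statistic from the SS and df values reported above.

```python
import math

# Sums of squares and degrees of freedom copied from the ANOVA table.
ss_regression = 8964.478169
ss_residual = 21044.76183
df_regression = 1
df_residual = 98

ss_total = ss_regression + ss_residual

# Mean squares: each sum of squares divided by its degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# R Square: share of the total variation explained by the regression.
r_squared = ss_regression / ss_total

# Standard error of the estimate: square root of the residual mean square.
standard_error = math.sqrt(ms_residual)

# F statistic: explained mean square over unexplained mean square.
f_statistic = ms_regression / ms_residual

print(f"R Square       = {r_squared:.6f}")
print(f"Standard Error = {standard_error:.5f}")
print(f"F              = {f_statistic:.4f}")
```

The results agree with the reported R Square (0.2987), Standard Error (14.654) and F (41.745) up to rounding.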
Y = 28.984 + 0.5831 * x
Marks = 28.984 + 0.5831 * Preparation time in hours
For a preparation time of one hour, the predicted mark is:
Marks = 28.984 + 0.5831 * 1
= 28.984 + 0.5831
= 29.567
Each additional hour of preparation time therefore raises the predicted mark by 0.5831.
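The fitted line can be evaluated at any preparation time. The minimal sketch below uses the rounded coefficients from the equation above; the function name `predict_marks` and the input values 60 and 61 are hypothetical, chosen only to show that two predictions one unit apart differ by exactly the slope.

```python
# Rounded coefficients from the fitted regression line.
INTERCEPT = 28.984
SLOPE = 0.5831


def predict_marks(preparation_time: float) -> float:
    """Return the predicted mark for a given preparation time."""
    return INTERCEPT + SLOPE * preparation_time


# Prediction at one unit of preparation time, as in the worked example.
print(round(predict_marks(1), 3))

# The difference between predictions one unit apart equals the slope.
change_per_unit = predict_marks(61) - predict_marks(60)
print(round(change_per_unit, 4))
```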
Question (1h)
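Statistics

| Statistic | Preparation time | Marks |
|---|---|---|
| N (valid) | 100 | 100 |
| N (missing) | 0 | 0 |
| Mean | 63.0400 | 65.7400 |
| Median | 64.0000 | 68.0000 |
| Mode | 64.00 | 70.00 |
| Std. Deviation | 16.32060 | 17.41045 |
| Variance | 266.362 | 303.124 |
| Range | 65.00 | 75.00 |
| Minimum | 25.00 | 25.00 |
| Maximum | 90.00 | 100.00 |
| Percentile 25 | 49.0000 | 54.0000 |
| Percentile 30 | 54.0000 | 58.0000 |
| Percentile 50 | 64.0000 | 68.0000 |
| Percentile 75 | 76.7500 | 78.0000 |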
Question (1i)
Correlation
Descriptive Statistics

| Variable | Mean | Std. Deviation | N |
|---|---|---|---|
| Preparation time | 63.0400 | 16.32060 | 100 |
| Marks | 65.7400 | 17.41045 | 100 |
Correlations

| Variable | Statistic | Preparation time | Marks |
|---|---|---|---|
| Preparation time | Pearson Correlation | 1 | .547** |
| | Sig. (2-tailed) | | .000 |
| | Sum of Squares and Cross-products | 26369.840 | 15375.040 |
| | Covariance | 266.362 | 155.303 |
| | N | 100 | 100 |
| Marks | Pearson Correlation | .547** | 1 |
| | Sig. (2-tailed) | .000 | |
| | Sum of Squares and Cross-products | 15375.040 | 30009.240 |
| | Covariance | 155.303 | 303.124 |
| | N | 100 | 100 |

**. Correlation is significant at the 0.01 level (2-tailed).
The correlation between preparation time and marks is r = 0.547.
Preparation time and marks therefore show a statistically significant relationship, since the Pearson correlation (r = 0.547) is significant at the 0.01 level (p < 0.01, two-tailed test).
The direction of the relationship is positive, meaning that the dependent and independent variables move in the same direction. An increase in preparation time is associated with an increase in marks, and a decrease in preparation time is associated with a decrease in marks (Stergiou, 2015).
The strength of the association can be judged from the magnitude of r. Since |r| = 0.547 lies just above the conventional range for a moderate association (.3 < |r| < .5), the relationship is moderate to strong.
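The Pearson correlation and the regression coefficients reported earlier can both be derived from the covariance and variance figures in the correlation output. The sketch below uses illustrative variable names and the rounded values from the tables, so small rounding differences are expected.

```python
import math

# Values copied from the correlation and descriptive statistics tables.
cov_xy = 155.303   # covariance of preparation time and marks
var_x = 266.362    # variance of preparation time
var_y = 303.124    # variance of marks
mean_x = 63.04     # mean preparation time
mean_y = 65.74     # mean marks

# Pearson correlation: covariance scaled by both standard deviations.
r = cov_xy / math.sqrt(var_x * var_y)

# Least-squares slope and intercept of marks on preparation time.
slope = cov_xy / var_x
intercept = mean_y - slope * mean_x

print(f"r         = {r:.3f}")
print(f"slope     = {slope:.4f}")
print(f"intercept = {intercept:.3f}")
```

The results match r = 0.547 and the fitted line Marks = 28.984 + 0.5831 * Preparation time up to rounding.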
Question (2a)
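The standard error of estimate measures the typical distance between the observed values of y and the values predicted by the regression line, that is, the standard deviation of the residuals. The smaller the standard error of estimate, the closer the observations lie to the fitted line and the better the model fits the data.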
Question (2b)
Our coefficient of determination is 0.2672.
It tells us that about 26.72% of the variation in the dependent variable (y) is explained by the independent variables (x1 and x2).
Question (2c)
The adjusted R Square = 0.2635.
The adjusted R Square (0.2635) is the coefficient of determination adjusted for degrees of freedom, that is, for the sample size and the number of independent variables in the model. It refines R Square (0.2672) by penalizing the inclusion of variables that add little explanatory power (Suslov Mark Yur’evich, 2015).
Both R Square and the adjusted R Square measure the fit of the model. The higher their values, the better the fit. Here the two values are close, so the model is not inflated by unnecessary variables, but with only about 26% of the variation explained the fit is weak.
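The adjustment for degrees of freedom uses the standard formula adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the sample size and k the number of independent variables. The sample size for the heights regression is not stated in this answer, so the sketch below checks the formula against the Question 1 output instead (R Square = 0.298723932, n = 100, k = 1); the function name is illustrative.

```python
def adjusted_r_squared(r_squared: float, n: int, k: int) -> float:
    """Adjust R Square for sample size n and k independent variables."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than parameters")
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)


# Check against the Question 1 regression output (n = 100, k = 1).
check = adjusted_r_squared(0.298723932, n=100, k=1)
print(f"{check:.6f}")
```

Because the penalty factor (n - 1)/(n - k - 1) is greater than 1 whenever k is at least 1, the adjusted value is always below R Square, and the gap widens as more variables are added for a fixed sample size.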
Question (2d)
The F-test measures the overall utility of the model. We therefore set up the hypotheses
H0: β1 = β2 = 0 (the model fits no better than an intercept-only model)
H1: At least one of β1 and β2 is not equal to 0 (the model fits better than an intercept-only model)
The test proceeds as follows:
Significance F, the p-value of the F-test, is 0.000 and the significance level is 0.05. Since the p-value is less than the significance level, we reject the null hypothesis and conclude that the model is useful: at least one of the independent variables is linearly related to y.
Question (2e)
The coefficients presented in the output are Intercept = 93.8993, X1 = 0.4849, X2 = -0.0229.
The p-value for the intercept is 0.0000, which is less than 0.05, revealing that the intercept is statistically significant. The p-value for the x1 variable is 0.0000, which is less than 0.05, revealing that the x1 variable is statistically significant.
The p-value for the x2 variable is 0.5615, which is greater than 0.05, revealing that the x2 variable is not statistically significant (Laber, 2017).
The coefficients imply that the y variable is predicted as follows:
Y = 93.8993 + 0.4849 * x1 - 0.0229 * x2
The coefficients are interpreted as follows:
Holding x2 constant, an increase of one unit in the x1 variable results in an increase of 0.4849 units in the y variable. Holding x1 constant, an increase of one unit in the x2 variable results in a decrease of 0.0229 units in the y variable. The negative sign indicates a negative relationship between x2 and y in this sample, although this coefficient is not statistically significant.
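The effect of each coefficient can be illustrated by evaluating the fitted equation at two points that differ by one unit in a single variable. In the sketch below the coefficients, including the negative sign on x2, are taken from the fitted equation above; the function name and the input values 170 and 160 are hypothetical and serve only as a baseline.

```python
# Coefficients from the fitted equation Y = 93.8993 + 0.4849*x1 - 0.0229*x2.
B0, B1, B2 = 93.8993, 0.4849, -0.0229


def predict_y(x1: float, x2: float) -> float:
    """Return the predicted y for given values of x1 and x2."""
    return B0 + B1 * x1 + B2 * x2


# Hypothetical baseline values, chosen only for illustration.
base = predict_y(170.0, 160.0)

# Raise one variable by one unit while holding the other constant.
effect_x1 = predict_y(171.0, 160.0) - base
effect_x2 = predict_y(170.0, 161.0) - base

print(f"effect of +1 in x1: {effect_x1:+.4f}")
print(f"effect of +1 in x2: {effect_x2:+.4f}")
```

Whatever baseline is chosen, the two differences equal the coefficients themselves, which is what "holding the other variable constant" means in a linear model.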
Question (2f)
Yes, the heights of the sons and the fathers are linearly related. This is justified by the x1 coefficient, which is positive and statistically significant (p-value = 0.0000, less than 0.05).
Question (2g)
No, the data do not allow us to infer that the heights of the sons and the mothers are linearly related. This is justified by the x2 coefficient, which is not statistically significant (p-value = 0.5615, greater than 0.05).
References
Aladag, S. C. (2015). Improvement in Estimating the Population Median in Simple Random Sampling and Stratified Random Sampling Using Auxiliary Information. Communications in Statistics - Theory and Methods, 44(5), 7886.
Khan, M. F. (2015). Image contrast enhancement using normalized histogram equalization. Optik - International Journal for Light and Electron Optics, 126(24), 7281.
Laber, E. B. (2017). Statistical Significance and the Dichotomization of Evidence: The Relevance of the ASA Statement on Statistical Significance and p-Values for Statisticians. Journal of the American Statistical Association, 112(519), 124134.
Saeed, A. D. (2017, August 21-25). Carousel. Proceedings of the Conference of the ACM Special Interest Group on Data Communication - SIGCOMM '17, Los Angeles, CA, USA, pp. 113. ACM Press. doi:10.1145/3098822.3098852
Stergiou, C. (2015). Explaining Correlations by Partitions. Foundations of Physics D, 45(12), 4556.
Stoykov, S. R. (2013). Nonlinear vibrations of beams with nonsymmetrical cross sections. International Journal of Non-Linear Mechanics, 10(55), 112119.
Suslov Mark Yur’evich, T. I. (2015). Ordinary least squares and currency exchange rate. International scientific review, 2(3), 2233.
My Assignment Help. (2020). The Essay On Importance Of Questionnaire Method, Simple Random Sampling, And Data Analysis In Surveying And Analyzing Data. Retrieved from https://myassignmenthelp.com/freesamples/hi6007groupassignmenteducation.