(a) Given, µ = population mean = 11.6 and σ = population standard deviation = 14.8.
Here, E(x̄) = µ and the standard error of x̄ is σ/√n, since the population follows a normal distribution. Therefore,
Sample mean or x̄ = 11.6 and
Population standard deviation = σ = 14.8.
(b) The z-score for a 95% confidence level is 1.96 when n is large, and the z-score for a 90% confidence level is 1.645 (Ross 2014). The interval gets wider as the confidence level increases. Therefore, the 90% confidence interval is narrower than the 95% confidence interval.
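The comparison in (b) can be checked numerically. The following is a minimal sketch, assuming a known σ and using the standard normal quantile from Python's standard library; the sample size n = 100 is a hypothetical value chosen only for illustration and does not come from the question.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value z_{alpha/2} for a given confidence level."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

def half_width(confidence, sigma, n):
    """Half-width of the z-based confidence interval for a mean."""
    return z_critical(confidence) * sigma / n ** 0.5

sigma = 14.8  # population standard deviation from part (a)
n = 100       # hypothetical sample size, for illustration only

z90 = z_critical(0.90)
z95 = z_critical(0.95)
w90 = half_width(0.90, sigma, n)
w95 = half_width(0.95, sigma, n)

print(f"90%: z = {z90:.3f}, half-width = {w90:.3f}")
print(f"95%: z = {z95:.3f}, half-width = {w95:.3f}")
```

Whatever value of n is used, the 90% half-width is smaller than the 95% half-width, because only the critical value changes between the two intervals.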
(c) Let n denote the required sample size. For estimating a mean with known σ,
n = (z_{α/2})² × σ² / (margin of error)²,
where z_{α/2} = z-score at α/2, σ = population standard deviation and the margin of error is the permitted error level. Given, σ = 14.8, margin of error is 0.01 and the 99% z-score is 2.576.
Therefore, n = (2.576)² × (14.8)² / (0.01)² = (6.635776 × 219.04)/(0.0001)
≈ 14535003.75, which is rounded up to 14535004.
The required sample size is 14535004.
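For estimating a mean with a known population standard deviation, the usual sample-size formula is n = (z × σ / E)², rounded up to the next whole number. The sketch below applies it to the values given in (c); the function name required_sample_size is an illustrative choice, and the tabulated value z = 2.576 is used rather than an exact quantile.

```python
import math

def required_sample_size(sigma, margin_of_error, z):
    """Smallest n such that z * sigma / sqrt(n) <= margin_of_error."""
    return math.ceil((z * sigma / margin_of_error) ** 2)

# Values from part (c): sigma = 14.8, margin of error = 0.01, 99% z-score = 2.576
n_required = required_sample_size(sigma=14.8, margin_of_error=0.01, z=2.576)
print(n_required)
```

The very large result reflects how small the permitted error (0.01) is relative to σ = 14.8.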
(d) The statement is not correct.
For a normal distribution, the 95% confidence interval of µ is given by {x̄ ± z_{α/2} × (σ/√n)}. The standard error multiplied by the confidence coefficient is added to and subtracted from the sample mean, rather than the standard deviation alone (Leon-Garcia 2017). Therefore, we cannot claim with 95% probability that µ will fall between 11.6 and 14.8.
(e) The statement is not correct. We can say that about 68% of the observations will fall in the interval 11.6 ± 14.8.
A normal curve is a bell-shaped curve that is symmetric about its mean (Grimmett 2018). Half of the area under the curve lies to the right of the mean and the other half to the left. The standard deviation measures the dispersion around the mean. About 68% of the values fall within µ ± σ, i.e. P[µ − σ < X < µ + σ] ≈ 0.68.
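The one-standard-deviation probability can be verified directly from the normal CDF. This is a small sketch using the values µ = 11.6 and σ = 14.8 from part (a); the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 11.6, 14.8  # population mean and standard deviation from part (a)
dist = NormalDist(mu, sigma)

# Probability that an observation lies within one standard deviation of the mean
p_within_one_sd = dist.cdf(mu + sigma) - dist.cdf(mu - sigma)
print(f"P(mu - sigma < X < mu + sigma) = {p_within_one_sd:.4f}")
```

The result does not depend on the particular values of µ and σ; it is the same for every normal distribution.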
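(f) Let x̄ be the calculated (sample) mean and µ0 = 12.5 be the hypothesized mean. We are to test:
H0: µ = 12.5 vs H1: µ > 12.5.
The test statistic is: z = (x̄ − µ0)/(σ/√n), a z statistic because the population standard deviation σ is known.
So, the calculated statistic is (14.5 − 12.5)/(14.8/√30) = 0.740 and the tabulated value is 1.96 (the one-sided 5% critical value, 1.645, leads to the same decision).
Therefore, the calculated value is less than the tabulated value and the null hypothesis is not rejected: there is insufficient evidence that the average alcohol content exceeds 12.5%.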
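The arithmetic in (f) can be reproduced as follows. This is a minimal sketch that treats σ = 14.8 as known and therefore uses the standard normal distribution; the 5% significance level is an assumption, since the text does not state the level explicitly.

```python
from statistics import NormalDist

x_bar = 14.5   # sample mean used in (f)
mu_0 = 12.5    # hypothesized mean
sigma = 14.8   # population standard deviation
n = 30         # sample size used in (f)
alpha = 0.05   # assumed significance level (not stated in the text)

z_stat = (x_bar - mu_0) / (sigma / n ** 0.5)
p_value = 1.0 - NormalDist().cdf(z_stat)       # upper-tailed test, H1: mu > mu_0
z_crit = NormalDist().inv_cdf(1.0 - alpha)     # one-sided critical value
reject_h0 = z_stat > z_crit

print(f"z = {z_stat:.3f}, p-value = {p_value:.4f}, critical value = {z_crit:.3f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

The statistic is well below either the one-sided critical value or 1.96, so the decision does not depend on which of the two is used.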
Unbiased estimate of average customer satisfaction for region 1 = x̄1 = 54.06
Unbiased estimate of variance of customer satisfaction for region 1 = s1² = 45.30
Unbiased estimate of average customer satisfaction for region 2 = x̄2 = 45.30
Unbiased estimate of variance of customer satisfaction for region 2 = s2² = 39.06
The assumption is that both populations follow a normal distribution (Lyons and Peres 2016).
Test hypothesis: H0: σ1 = σ2 vs H1: σ1 ≠ σ2, where σ1 is the population standard deviation of region 1 and σ2 is the population standard deviation of region 2.
Test statistic: F = s1²/s2², where s1² is the sample variance of region 1 and s2² is the sample variance of region 2.
F ~ F (15, 12)
Calculation results are:
F statistic = 1.1317.
F – Critical value = 2.6371.
It can be seen that tabulated F > calculated F. Therefore, the null hypothesis is not rejected (Aidara 2018). At the 5% significance level there is no evidence that the variances differ, so the assumption of equal variances is retained.
Test hypothesis: H0: µ1 ≤ µ2 vs. H1: µ1 > µ2, where µ1 and µ2 are the population mean satisfaction of region 1 and region 2 respectively.
Test statistic: t = (x̄1 − x̄2)/[s × √{(1/N1) + (1/N2)}] ~ t27, where x̄1 and x̄2 are the sample means of the given dataset for region 1 and region 2 respectively, s is the pooled standard deviation and N1 and N2 are the sample sizes.
Calculation results are:
T-stat = 3.5901.
P(T<=t) = 0.0006
T critical value = 2.47
It can be seen that calculated t > tabulated t. Therefore, the null hypothesis is rejected and it can be said that the average satisfaction of region 1 is significantly higher than that of region 2 at the 1% level of significance.
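The pooled two-sample t statistic can be recomputed from the summary figures listed earlier. The sketch below rests on two assumptions that are not stated in the text: the sample sizes n1 = 16 and n2 = 13 are inferred from the degrees of freedom F(15, 12) and t27, and the values 45.30 and 39.06 are treated as sample variances. Because the inputs are rounded summaries, the result is only expected to be close to the reported 3.5901, not identical.

```python
import math

def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / se, df

# Summary statistics from the text; n1 and n2 are ASSUMED from F(15, 12)
t_stat, df = pooled_t(mean1=54.06, var1=45.30, n1=16,
                      mean2=45.30, var2=39.06, n2=13)
t_crit = 2.47  # tabulated critical value quoted in the text

print(f"t = {t_stat:.4f} on {df} degrees of freedom")
print("Reject H0" if t_stat > t_crit else "Do not reject H0")
```

Under these assumptions the statistic exceeds the tabulated critical value, which corresponds to rejecting the null hypothesis.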
Let p denote the proportion of people satisfied with the services (Allen 2014).
Test hypothesis: H0: p = 0.3 vs H1: p < 0.3.
Test statistic: k = (p̂ − p0)/√{p0 × (1 − p0)/n}, where p̂ is the sample proportion, p0 = 0.3 is the hypothesized proportion and n is the total sample size.
Calculation results are:
Test statistic = 0.117
P-value = 0.4538
It can be seen that p-value = 0.4538 > 0.05. Therefore, the null hypothesis is not rejected and it cannot be said that the proportion has decreased: at the 5% level of significance there is insufficient evidence that fewer than 30% are satisfied in the present scenario.
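The mechanics of a lower-tailed one-sample proportion test can be sketched as below. The counts used here (29 satisfied out of 100) are hypothetical and chosen purely for illustration; they are not the data behind the results reported above, so the printed numbers will not match them.

```python
import math
from statistics import NormalDist

def proportion_z_test_lower(successes, n, p0):
    """Lower-tailed z test of H0: p = p0 against H1: p < p0."""
    p_hat = successes / n
    se = math.sqrt(p0 * (1.0 - p0) / n)   # standard error under H0
    z = (p_hat - p0) / se
    p_value = NormalDist().cdf(z)         # lower-tail probability
    return z, p_value

# HYPOTHETICAL data: 29 satisfied respondents out of 100
z_prop, p_val_prop = proportion_z_test_lower(successes=29, n=100, p0=0.3)
alpha = 0.05

print(f"z = {z_prop:.4f}, p-value = {p_val_prop:.4f}")
print("Reject H0" if p_val_prop < alpha else "Do not reject H0")
```

The decision rule is the one used in the text: the null hypothesis is rejected only when the p-value falls below the chosen significance level.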
(i). With the assumption of homoscedasticity, the standard error for the difference in means is given by √{(s1²/n1) + (s2²/n2)}, where s1² and s2² are the sample variances. The standard error is inversely proportional to the square root of the sample size. Therefore, as the sample size increases, the standard error decreases.
(ii). In a two-sample mean test, with the assumptions of homoscedasticity and equal sample sizes, the critical value of Student's t test is larger than the critical value of the approximate test based on the CLT. The reason is that the t critical value depends on the degrees of freedom, and hence on the sample size, whereas the normal critical value used in the test based on the central limit theorem does not depend on the sample size. The difference shrinks as the sample size grows, because the t distribution approaches the normal distribution.
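The effect of sample size described in (i) can be illustrated numerically. This is a small sketch that reuses the two variance estimates from the customer satisfaction data (45.30 and 39.06) and assumes equal sample sizes in both groups; the particular sizes 10, 40 and 160 are arbitrary illustrative choices.

```python
import math

def se_difference(var1, var2, n1, n2):
    """Standard error of the difference between two sample means."""
    return math.sqrt(var1 / n1 + var2 / n2)

sizes = [10, 40, 160]  # arbitrary illustrative sample sizes (same in both groups)
ses = [se_difference(45.30, 39.06, n, n) for n in sizes]

for n, se in zip(sizes, ses):
    print(f"n1 = n2 = {n:>3}: standard error = {se:.4f}")
```

Each fourfold increase in the sample size halves the standard error, which is the inverse square-root relationship.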
References:
Ross, S.M., 2014. Introduction to probability models. Academic Press.
Dudley, R.M., 2018. Real analysis and probability. CRC Press.
Leon-Garcia, A., 2017. Probability, statistics, and random processes for electrical engineering.
Aidara, N., 2018. Introduction to probability and statistics.
Lyons, R. and Peres, Y., 2016. Probability on trees and networks (Vol. 42). Cambridge University Press.
Allen, A.O., 2014. Probability, statistics, and queueing theory. Academic Press.