1 (a) A study was conducted to analyse the relationship between mathematical ability and musical ability for a group of 10 lower-primary school students. The data is shown in Table Q1(a).
(i) Sketch the scatterplot and present your observation.
(ii) Analyse and compute the Pearson correlation coefficient.
(iii) Assuming a linear relationship with music score as the dependent variable, determine the least-squares linear regression line.
(iv) Based on the model, predict the musical ability score for a mathematical ability score of 7.5.
(b) An insurance company wishes to estimate the average claim amount for a type of medical procedure. It was found that a random sample of 40 claims had a mean claim of $18250. Suppose the standard deviation of all claim amounts for this type of medical procedure is $1660.
(i) Construct a 90 % confidence interval for the average claim amount of all medical procedures of this type. Report any assumptions in your analysis.
(ii) Discuss what will happen to the width of the confidence interval if the confidence level is increased to 99 %.
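A minimal sketch of the calculation behind part (b), not a model answer. It assumes a z-interval is appropriate because the population standard deviation is given and n = 40 is large enough for the sample mean to be approximately normal. The function name `z_interval` and the variable names are illustrative, not part of the question.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the question text.
sample_mean = 18250.0
sigma = 1660.0   # population standard deviation, treated as known
n = 40

def z_interval(mean, sigma, n, level):
    """Two-sided z confidence interval for a mean with known sigma."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # critical value
    margin = z * sigma / sqrt(n)                # margin of error
    return mean - margin, mean + margin

lo90, hi90 = z_interval(sample_mean, sigma, n, 0.90)
lo99, hi99 = z_interval(sample_mean, sigma, n, 0.99)

print(f"90% CI: ({lo90:.2f}, {hi90:.2f}), width {hi90 - lo90:.2f}")
print(f"99% CI: ({lo99:.2f}, {hi99:.2f}), width {hi99 - lo99:.2f}")
```

Under these assumptions the 99 % interval is wider than the 90 % interval, because the critical value grows while the standard error stays the same.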
2 (a) Table Q2(a) depicts the results from an experimental study to investigate the relationship between the age of an infant and the amount of eye contact the infant makes with the mother. The infants were six months old and nine months old at the time of testing. The data scores in Table Q2(a) denote the number of one-minute segments during which the infant made any eye contact with the mother over a ten-minute session.
At α = 1 %, apply a paired t-test to determine whether there is any significant difference between the infants’ scores at six and nine months old. Comment on the results and interpret the p-value.
(b) Suppose the electricity bills for all households have a skewed distribution with a mean of $145 and a standard deviation of $30. Compute the probability that the mean electricity bill for a random sample of 100 households will be:
(i) more than $138.
(ii) between $140 and $152.
(iii) within $7 of the population mean.
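The three probabilities in part (b) can be sketched as follows. This is an illustration under the assumption that, with n = 100, the central limit theorem makes the sample mean approximately normal even though the population is skewed. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the question text.
mu = 145.0
sigma = 30.0
n = 100

# Approximate sampling distribution of the sample mean (CLT assumption).
xbar = NormalDist(mu, sigma / sqrt(n))   # standard error = 30 / 10 = 3

p_more_than_138 = 1 - xbar.cdf(138)
p_between_140_152 = xbar.cdf(152) - xbar.cdf(140)
p_within_7 = xbar.cdf(mu + 7) - xbar.cdf(mu - 7)

print(f"P(mean > 138)        = {p_more_than_138:.4f}")
print(f"P(140 < mean < 152)  = {p_between_140_152:.4f}")
print(f"P(|mean - 145| <= 7) = {p_within_7:.4f}")
```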
3 (a) Suppose XYZ company manufactures a type of product using two machines, A and B. Based on a sampling plan, a quality assurance inspector takes a random sample of 200 units of this product and classifies each unit as good or defective. The results are shown in Table Q3(a). At α = 5 %, apply a suitable hypothesis test to determine whether the two attributes, the machine type and the product being good or defective, are independent.
(b) A psychologist conducted an experiment to test people’s sense of direction by leading them through a maze in a building. Eighty participants were asked to identify which of the four directions is East. The maze is so complex that all participants simply guessed the direction.
(i) Determine the probability that a participant guesses the direction correctly.
(ii) Calculate the expected number of participants who guessed the direction correctly.
(iii) Apply the normal approximation method and compute the probability that at least 25 participants correctly identified the direction.
(iv) Apply the normal approximation method and compute the probability that fewer than 10 participants correctly identified the direction.
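A hedged sketch of parts (i) to (iv), assuming each participant guesses independently and picks each of the four directions with equal probability, so the number of correct guesses is Binomial(80, 0.25). A continuity correction of 0.5 is applied here. That is a common convention but an assumption of this sketch, since the question does not say whether to use one.

```python
from math import sqrt
from statistics import NormalDist

n = 80          # participants, from the question
p = 1 / 4       # (i) one correct direction out of four equally likely guesses

expected = n * p                 # (ii) binomial mean
sd = sqrt(n * p * (1 - p))       # binomial standard deviation

# Normal approximation to the binomial count of correct guesses.
approx = NormalDist(expected, sd)

# (iii) P(X >= 25) is approximated by P(Y > 24.5).
p_at_least_25 = 1 - approx.cdf(24.5)

# (iv) P(X < 10) = P(X <= 9) is approximated by P(Y < 9.5).
p_fewer_than_10 = approx.cdf(9.5)

print(f"p = {p}, expected correct = {expected}")
print(f"P(X >= 25) ~ {p_at_least_25:.4f}")
print(f"P(X < 10)  ~ {p_fewer_than_10:.4f}")
```

The approximation is reasonable under these assumptions because np = 20 and n(1 - p) = 60 are both well above the usual rule-of-thumb thresholds.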
4 (a) In an observational health study of Alzheimer’s disease (AD), the age data was collected from 10 AD patients exhibiting moderate dementia and a group of 14 individuals without AD, as shown in Table Q4(a). The data denotes the patients’ ages in years.
(i) Based on α = 5 %, apply a two-sample t-test to examine whether there is a significant age difference between the AD and non-AD groups. Comment on the results.
(ii) Examine the p-value for this test.
(iii) Construct a 99 % confidence interval for the difference in mean age between the AD and non-AD groups.
(b) Calculate the missing values in the one-way ANOVA shown in Table Q4(b).
5 (a) Consider a random sample of 12 bus journey times from bus stop A to bus stop E, as shown in Table Q5(a). Suppose the bus journey time is normally distributed.
(i) Calculate the unbiased estimate of the mean of the population from which this sample was drawn.
(ii) At α = 5 %, use R to perform a one-sample t-test to determine whether there is sufficient evidence to support the claim that the mean journey time exceeds 16. Outline the R code.
(b) Table Q5(b) shows the assessment rating scores of a new product from two independent samples of judges comprising professionals and laypersons. Assuming that the large-sample normal approximation method can be used and based on α = 5 %, apply a suitable non-parametric test to determine whether there is any difference in the mean assessment rating scores between professionals and laypersons.
(c) The number of flaws observed on a large coil of galvanized steel is hypothesised to follow a Poisson distribution. A random sample of 60 coils was inspected for the number of flaws. Compute the unknown expected frequency for the cell in Table Q5(c).