Statistical Hypothesis Testing and Analysis

A researcher working for a drug company is interested in investigating the effect of storage on the potency of one of the company’s drugs. Ten freshly produced bottles of the drug were randomly selected and their potency measured. Another ten freshly produced bottles of the drug were randomly chosen and stored under controlled conditions for six months before their potency was measured. Summary statistics are displayed below:

(a) Carry out a t-test to investigate whether there is a difference between the underlying mean potency of the freshly produced drug and the underlying mean potency of the drug when stored for six months.

(i) You must clearly show that you have followed the “Step-by-Step Guide to Performing a Hypothesis Test by Hand” given in the Lecture Workbook, Page 12, Chapter 7.

(ii) At steps 5 and 8 it is necessary to use the t-procedures tool on Canvas to determine the standard error and the t-multiplier. Look under: Assignments → Assignment 3.

(iii) At step 6 it is necessary to use either the t-procedures tool on Canvas, a graphics calculator, SPSS or Excel to determine the P-value.

(iv) Refer to the instructions on Page 1 of this assignment: “Hypothesis tests in this assignment”.

(b) Does the confidence interval given in part (a) contain the true value of the parameter? Briefly explain.
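The by-hand calculations behind steps 5 to 8 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the summary statistics below are made-up placeholders (the assignment's own table is not reproduced here), the function names are hypothetical, and the degrees of freedom use the conservative by-hand rule min(n1 − 1, n2 − 1), which may differ from what the Canvas tool or SPSS reports. The t-multiplier is passed in rather than computed, since it is meant to come from the t-procedures tool.

```python
import math

def two_sample_t(mean1, sd1, n1, mean2, sd2, n2, hypothesised_diff=0.0):
    """Two-sample t statistic from summary statistics (independent samples)."""
    estimate = mean1 - mean2                      # difference in sample means
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)     # standard error of the difference
    t0 = (estimate - hypothesised_diff) / se      # t-test statistic
    df = min(n1 - 1, n2 - 1)                      # ASSUMPTION: conservative by-hand df
    return estimate, se, t0, df

def confidence_interval(estimate, se, t_multiplier):
    """Estimate plus or minus t-multiplier times standard error."""
    margin = t_multiplier * se
    return estimate - margin, estimate + margin

# Placeholder values only; these are NOT the assignment's summary statistics.
estimate, se, t0, df = two_sample_t(10.0, 0.3, 10, 9.5, 0.4, 10)
# The multiplier would normally be read from the t-procedures tool for this df.
lower, upper = confidence_interval(estimate, se, t_multiplier=2.262)
print(f"estimate={estimate:.3f}, se={se:.4f}, t0={t0:.3f}, df={df}")
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

The P-value is deliberately left out of the sketch, because the assignment asks for it to be obtained from the Canvas tool, a graphics calculator, SPSS or Excel.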

Horizon Research recently published the report ‘Generation KiwiSaver – Securing the Future of Young New Zealanders’, which was based on a public opinion survey completed in July 2017¹. The survey was conducted online with a nationally representative sample of 2199 New Zealanders aged 18 years or over (adult New Zealanders). You may assume that the survey used simple random sampling.

One question in the survey asked:

“When you retire, how much are you expecting to have in the following investments?”

The table below gives these investments and shows the 469 responses to this question received from respondents aged 18 to 34 years.

Researchers² were interested in finding out whether there really is a home game advantage for teams playing in the National Football League (NFL) in the USA. That is, they were interested in whether the home team scored more points, on average, than the away team. If there is a home game advantage, they were also interested in estimating its size. Data were collected for a random sample of 256 games, and for each game the scores for both the home and the away team were recorded.

Notes:

(i) SPSS and Excel files of the data are available on Canvas on the Home Page, or look under Assignments → Assignment 3.

Click on:

• NFLASPSS or NFLAINZIGHT

• NFLBSPSS or NFLBINZIGHT

(ii) To answer parts (c) and (d) you need to ensure that you use the file(s) which has the data in the form that is appropriate for the design of this study.

(a) For this study describe the:

(i) units,

(ii) treatment or factor of interest,

(iii) response.

(b) (i) Use iNZight to draw the appropriate box plot(s) for this data set. (You should consider the design of this study to ensure the relevant plot(s) is drawn.)

(ii) Comment on any features in the plot(s).

(c) Investigate whether, on average, the home team scores more points than the away team. Use SPSS to conduct a t-test. Interpret your results. (You should consider the design of this study to ensure the appropriate t-test is conducted.)

Notes:

(i) You must clearly show that you have followed steps 1, 2, 3, 7, 9 and 10 in the “Step-by-Step Guide to Performing a Hypothesis Test by Hand”, Lecture Workbook, page 12, Chapter 7. The other steps are replaced by your computer output, which you must hand in.

(ii) Refer to the instructions on Page 1 of this assignment: “Hypothesis tests in this assignment”.

(d) Comment on the validity of the t-procedures conducted in (c) by briefly discussing each assumption.
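Because each game contributes both a home score and an away score, one way to see what a test on the within-game differences computes is the sketch below. It assumes the data are treated as paired, uses a tiny made-up set of scores (not the NFL data files on Canvas), and the function name `paired_t` is hypothetical. It is not a substitute for the SPSS output the question asks for.

```python
import math
import statistics

def paired_t(home, away, hypothesised_mean_diff=0.0):
    """t statistic for the mean of the within-game differences (home - away)."""
    diffs = [h - a for h, a in zip(home, away)]   # one difference per game
    n = len(diffs)
    mean_diff = statistics.mean(diffs)            # estimate of the mean difference
    sd_diff = statistics.stdev(diffs)             # sample SD of the differences
    se = sd_diff / math.sqrt(n)                   # standard error of the mean difference
    t0 = (mean_diff - hypothesised_mean_diff) / se
    return mean_diff, se, t0, n - 1               # df = number of pairs - 1

# Made-up scores for five games, for illustration only.
home = [24, 17, 31, 20, 27]
away = [20, 21, 24, 13, 23]
mean_diff, se, t0, df = paired_t(home, away)
print(f"mean difference={mean_diff:.2f}, se={se:.3f}, t0={t0:.3f}, df={df}")
```

Working with the differences reduces the problem to a one-sample analysis, which is why the assumptions discussed in (d) would then concern the differences rather than the two sets of scores separately.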

We wish to investigate whether, on average, the sodium (salt) content of breakfast cereals depends on the manufacturer: Hubbards, Kellogg’s and Sanitarium. Three random samples each of 20 breakfast cereals on sale in New Zealand were collected from the three manufacturers and the sodium content (in milligrams per 100 grams of cereal) was recorded.

Note:

Excel (CerealINZIGHT) and SPSS (CerealSPSS) files of the data are available on Canvas on the Home Page, or look under Assignments → Assignment 3.

(a) For this study describe the:

(i) units,

(ii) treatment or factor of interest,

(iii) response.

(b) (i) Use iNZight to draw the appropriate plot(s) for this data set.

(ii) Comment on any features in the plot(s) in terms of the original story.

(c) Using SPSS, provide the computer output of an F-test on these data.

Notes:

• Refer to the SPSS Tutorial, Pages 16 and 17, on Canvas. (Look under Software Information and Help → SPSS Help.)

• Ensure that you complete Step 1 through to Step 4 of the instructions on Pages 16 and 17.

(d) State the assumptions of the F-test in terms of the original story.

(e) Calculate the ratio of the largest sample standard deviation for the sodium content to the smallest sample standard deviation for the sodium content.

(f) Comment on the validity of the F-test by briefly discussing each assumption.

(g) Assume that an F-test is an appropriate test to use here.

(i) State the null hypothesis for the test, both in words and using symbols.

(ii) State the alternative hypothesis for the test in words.

(iii) What does the result of the F-test tell you about the underlying mean sodium content for the three manufacturers? Explain your answer in 1 to 2 sentences.

(h) (i) Assume the Tukey’s pairwise comparisons are valid and appropriate.

Investigate whether there is a difference between the underlying mean sodium content of breakfast cereals produced by Sanitarium and that of breakfast cereals produced by Kellogg’s. Interpret the P-value and confidence interval.

Note: A conclusion is not required here.

(ii) Between which pair (or pairs) of manufacturers were there significant differences (at the 5% level) in the underlying mean sodium content of breakfast cereals?

(iii) Are we able to determine which manufacturer has the lowest underlying mean sodium content of breakfast cereals? If so, name the manufacturer.

(i) In three to five sentences, provide an overall conclusion for this study.
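The two quantities this question revolves around, the standard deviation ratio in part (e) and the F-test statistic in part (c), can be illustrated with a short Python sketch. Everything here is an assumption for illustration: the data are tiny made-up samples with placeholder group labels (not the cereal data on Canvas), the function names are hypothetical, and the F statistic is computed from the standard one-way between-groups and within-groups sums of squares. SPSS remains the tool the assignment requires.

```python
import statistics

def sd_ratio(groups):
    """Largest sample SD divided by smallest sample SD across the groups."""
    sds = [statistics.stdev(values) for values in groups.values()]
    return max(sds) / min(sds)

def one_way_f(groups):
    """One-way F statistic: between-groups mean square over within-groups mean square."""
    samples = list(groups.values())
    k = len(samples)                                   # number of groups
    n_total = sum(len(s) for s in samples)             # total number of observations
    grand_mean = sum(sum(s) for s in samples) / n_total
    ss_between = sum(len(s) * (statistics.mean(s) - grand_mean) ** 2 for s in samples)
    ss_within = sum((x - statistics.mean(s)) ** 2 for s in samples for x in s)
    df_between, df_within = k - 1, n_total - k
    f0 = (ss_between / df_between) / (ss_within / df_within)
    return f0, df_between, df_within

# Made-up values with placeholder labels; NOT the cereal sodium data.
sodium = {
    "Group A": [1, 2, 3],
    "Group B": [2, 4, 6],
    "Group C": [5, 6, 7],
}
ratio = sd_ratio(sodium)
f0, df_between, df_within = one_way_f(sodium)
print(f"SD ratio={ratio:.2f}, F={f0:.2f}, df=({df_between}, {df_within})")
```

A large F statistic indicates that the sample means vary more between groups than the within-group spread would suggest, while the SD ratio gives a quick check on whether the group spreads are similar enough for the F-test's equal-spread assumption to be plausible.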

As part of ongoing market research, a random sample of players who had registered a computer game online were surveyed six months after the game was released. Data were recorded for the following variables.

(a) For each of the scenarios 1 to 5 below:

(i) Write down the name of the variable(s), given in the table above, needed to examine the question.

(ii) For each variable in (i) write down its type (numeric or categorical).

(b) What tool(s) should you use to begin to investigate the scenarios 1 to 5 below? Write down the scenario number 1 to 5 followed by the appropriate tool. Hint: Refer to the blue notes in Chapter 1 in the Lecture Workbook.

(c) Given that the underlying assumptions are satisfied, which form of analysis below should be used in the investigation of each of the scenarios 1 to 5 below? Write down the scenario number 1 to 5 followed by the appropriate Code A to F.

Scenario 1: Is there a link between the age group of a player and the overall rating of the game?

Scenario 2: Do we expect the highest score that males achieve in the game to be different from the highest score that females achieve in the game?

Scenario 3: Is there a difference between the proportion of players aged 16 – 19 who play online and the proportion of players aged 20 – 29 who play online?

Scenario 4: Is there a difference between a player’s enjoyment rating and their playability rating?

Scenario 5: Does the presentation rating of the game depend on the computer operating system used?
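As an illustration of the kind of calculation that sits behind a comparison such as Scenario 3, the sketch below computes the difference between two sample proportions and its standard error. It assumes the two age groups are independent samples, uses made-up counts (not survey data), and the function name is hypothetical; it is not intended to indicate which Code applies.

```python
import math

def diff_two_proportions(x1, n1, x2, n2):
    """Difference between two independent sample proportions and its standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return p1 - p2, se

# Made-up counts: 30 of 50 players in one group, 40 of 100 in the other.
estimate, se = diff_two_proportions(30, 50, 40, 100)
print(f"difference in proportions={estimate:.2f}, se={se:.4f}")
```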