Questions:
1: Alcohol and nicotine consumption during pregnancy may harm children. Because drinking and smoking behaviours may be related, it is important to understand the nature of this relationship when assessing the possible effects on children. One study classified 452 mothers according to their alcohol intake prior to pregnancy recognition and their nicotine intake during pregnancy. The data are summarized in the following table (from Ann P. Streissguth et al., “Intrauterine alcohol and nicotine exposure: attention and reaction time in 4-year-old children,” Developmental Psychology, 20 (1984), pp. 533-541):
Carry out a complete analysis of the association between alcohol and nicotine consumption. That is, describe the nature and strength of this association and assess its statistical significance. You may include figures, if required, to display the associations.
2. Human immunodeficiency virus (HIV) counselling and testing is recognized as the gateway to HIV prevention and access to antiretroviral treatment (ART), care and support. In Sub-Saharan African countries, including South Africa, uptake of HIV testing among men remains a major challenge. The factors implicated in reduced likelihood of HIV testing in research studies include age, racial group, employment status, being single, place of residence, and sexual behaviours. A researcher presents you with the following data from a cross sectional survey of 112 men in a South African village that she has just completed. The data she collected included HIV testing in the last 12 months, together with socioeconomic and demographic information. She is particularly interested in HIV testing and the impact of unemployment, but also has other questions of interest.
a. The researcher wishes to know if undergoing HIV testing is associated with employment status.
i. What proportion of the population sampled underwent HIV testing?
ii. What proportion of those unemployed underwent HIV testing?
iii. What proportion of those employed underwent HIV testing?
iv. Based on i., ii. and iii., state whether you believe undergoing HIV testing is associated with employment status and give your reasons in a short answer.
v. Is undergoing HIV testing associated with employment status? Present a complete analysis – state your hypothesis, compute test statistic, p value and 95% CI and interpretation.
b. Using the fact that the odds ratio (OR) for the association between HIV testing and being employed is 2.15 with 95% CI (1.01, 4.62), report to the researcher your conclusion to her questions about HIV testing and the impact of employment status.
c. Before the researcher presents these results publicly, what should she be thinking of doing to strengthen them or make them more reliable? (Think about any weaknesses in the analysis so far.)
3. The therapeutic efficacy and tolerability of calcipotriol ointment and betamethasone ointment in psoriasis were compared in a multicentre, prospective, randomised, double-blind, right/left trial. 345 patients with psoriasis vulgaris were treated twice daily for 6 weeks with calcipotriol ointment and betamethasone ointment randomly assigned to opposite sides of the body. The severity of the condition was assessed by self-report as “cleared”, “pronounced improvement”, “slight improvement”, “no change”, or “worse” and by the psoriasis area and severity index (PASI), scored by the investigator.
The reduction in PASI score was significantly greater (p < 0.001) for the calcipotriol side (mean reduction 68.8%) than the betamethasone side (mean reduction 61.4%). Patients were more likely to report “cleared” or “pronounced improvement” on the calcipotriol side (Kragballe et al., 1991).
a. Suggest 2 methods that could be used to carry out the significance test on the PASI score.
b. The self-assessment was recoded into two categories, “cleared or pronounced improvement” or “little or no improvement”. What method could be used to carry out the significance test for the recoded data?
Answers:
Introduction
Biostatistics is the application of statistical methods to the biological sciences. This paper examines three crucial areas of biostatistics: measures of association, hypothesis testing and significance testing. Each area is examined through one of three questions. The first question covers measures of association, the second covers proportions and hypothesis testing, and the third covers significance testing. The paper answers these questions and gives a brief description of each area.
Measures of association
Measures of association are used to quantify the relationship between two or more variables (Dicker, Coronado & Koo, 2006). Several methods of analysis can be used to determine measures of association, including correlation analysis and regression analysis (Gaddis, 1990). Examples include Pearson’s correlation coefficient, Spearman’s rank-order correlation coefficient, the chi-square test, relative risk and the odds ratio (Haung, 2016). To determine the association between alcohol and nicotine consumption we shall carry out a chi-square test on the data provided. A chi-square test is a test of significance rather than a measure of the strength of the association.
1. Chi-square test
Null hypothesis (H0): there is no association between alcohol and nicotine consumption.
Alternative hypothesis (H1): there is an association between alcohol and nicotine consumption.
Expected values = (row total x column total)/ grand total
| Alcohol, ounces/day | Nicotine none (mg/day) | Nicotine 1-15 (mg/day) | Nicotine 16 or more (mg/day) | Total |
|---|---|---|---|---|
| None | 82.73 | 17.69 | 22.59 | 123 |
| 0.01-0.10 | 51.12 | 10.93 | 13.96 | 76 |
| 0.11-0.99 | 109.63 | 23.44 | 29.93 | 163 |
| 1.00 or more | 60.53 | 12.94 | 16.53 | 90 |
| Total | 304 | 65 | 83 | 452 |
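The expected counts in the table can be reproduced from the marginal totals alone. The following is a minimal sketch, assuming only the row and column totals shown in the table; the dictionary and variable names are illustrative.

```python
# Marginal totals taken from the expected-value table.
row_totals = {"None": 123, "0.01-0.10": 76, "0.11-0.99": 163, "1.00 or more": 90}
col_totals = {"none": 304, "1-15": 65, "16 or more": 83}
grand_total = sum(row_totals.values())

# Expected count under independence: row total * column total / grand total.
expected = {
    (alcohol, nicotine): r * c / grand_total
    for alcohol, r in row_totals.items()
    for nicotine, c in col_totals.items()
}

for (alcohol, nicotine), value in expected.items():
    print(f"{alcohol:>12} | {nicotine:>10} | {value:7.2f}")
```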
X² statistic = Σ (O - E)² / E
| Alcohol, ounces/day | Nicotine none (mg/day) | Nicotine 1-15 (mg/day) | Nicotine 16 or more (mg/day) | Total |
|---|---|---|---|---|
| None | 6.00 | 6.46 | 5.95 | |
| 0.01-0.10 | 0.93 | 3.22 | 0.07 | |
| 0.11-0.99 | 5.99 | 7.84 | 4.87 | |
| 1.00 or more | 0.21 | 0.72 | 0.01 | |
| Σ X² | | | | 42.27 |

ΣX² = 42.27
Degrees of freedom = (r-1)(c-1)
= (4-1) x (3-1)
= 3 x 2
= 6
Finding the critical value:
At a significance level of 0.05 and 6 d.f., the critical X² = 12.59.
The calculated X² (42.27) is greater than the critical X², so we reject H0. There is, therefore, a statistically significant association between alcohol and nicotine consumption.
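The test can be checked numerically. The sketch below sums the cell contributions from the table (4 alcohol rows by 3 nicotine columns, so 3 x 2 = 6 degrees of freedom) and computes the upper-tail P-value. The helper `chi_square_sf_even_df` is an illustrative function, not a library call; it uses a closed form that holds only when the degrees of freedom are even.

```python
import math

# Cell contributions (O - E)^2 / E as reported in the table, row by row.
contributions = [
    [6.00, 6.46, 5.95],
    [0.93, 3.22, 0.07],
    [5.99, 7.84, 4.87],
    [0.21, 0.72, 0.01],
]
chi_square = sum(sum(row) for row in contributions)

n_rows, n_cols = len(contributions), len(contributions[0])
df = (n_rows - 1) * (n_cols - 1)


def chi_square_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for even, positive df")
    half = x / 2.0
    terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * terms


p_value = chi_square_sf_even_df(chi_square, df)
print(f"X^2 = {chi_square:.2f}, df = {df}, p = {p_value:.2e}")
```

The same helper evaluated at 12.59 with 6 degrees of freedom returns approximately 0.05, which is a quick check on the tabulated critical value.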
The strength of the association
Most measures of association have a scale such that they reach a value of 1 if the two variables have a perfect association with each other (Daniel, 2009). Methods used to measure the magnitude of association include Cramér’s V, lambda, phi, Kendall’s tau-b and Kendall’s tau-c (Cohen & Nee, 2010). The chi-square statistic on its own does not show the strength of the association between the variables.
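As an illustration of one of the measures named above, Cramér's V can be obtained directly from the chi-square total. This is a minimal sketch, assuming the reported total of 42.27, the sample of 452 mothers and the 4 x 3 layout of the table; the function name is illustrative.

```python
import math

chi_square = 42.27      # total X^2 reported for the alcohol by nicotine table
n = 452                 # number of mothers classified
n_rows, n_cols = 4, 3   # alcohol categories by nicotine categories


def cramers_v(chi_square, n, n_rows, n_cols):
    """Cramer's V = sqrt(X^2 / (n * min(r - 1, c - 1))), ranging from 0 to 1."""
    return math.sqrt(chi_square / (n * min(n_rows - 1, n_cols - 1)))


v = cramers_v(chi_square, n, n_rows, n_cols)
print(f"Cramer's V = {v:.3f}")
```

A value much nearer 0 than 1 would point to a modest association, even when the chi-square test itself is highly significant.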
2. Proportion and hypothesis testing
A proportion compares a part with the whole. It is a type of ratio in which the numerator is part of the denominator (Dicker, Coronado & Koo, 2006). A proportion can be expressed as a decimal, a fraction or a percentage.
Proportion is calculated by
Proportion = number of persons with the characteristic / total number of persons in the group
a. i. The proportion of the sample that underwent HIV testing = 49/112
= 0.4375
ii. The proportion of those unemployed who underwent HIV testing = 18/53
= 0.3396
iii. The proportion of the employed who underwent HIV testing = 31/59
= 0.5254
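The three proportions can be reproduced with a short calculation. This is a sketch only: the counts used (31 of 59 employed men tested, 18 of 53 unemployed men tested) are assumptions inferred from the reported proportions and the sample size of 112, not figures quoted from the survey table.

```python
# Counts inferred from the reported proportions (assumption, see lead-in).
tested_employed, n_employed = 31, 59
tested_unemployed, n_unemployed = 18, 53

n_total = n_employed + n_unemployed
tested_total = tested_employed + tested_unemployed


def proportion(part, whole):
    """A proportion is a ratio whose numerator is part of the denominator."""
    return part / whole


p_overall = proportion(tested_total, n_total)
p_unemployed = proportion(tested_unemployed, n_unemployed)
p_employed = proportion(tested_employed, n_employed)

print(f"overall:    {p_overall:.4f}")
print(f"unemployed: {p_unemployed:.4f}")
print(f"employed:   {p_employed:.4f}")
```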
To judge from the above proportions whether undergoing HIV testing is associated with employment status, we calculate the relative proportion and the attributable proportion. The relative proportion compares the proportion of the employed who underwent HIV testing with the proportion of the unemployed who underwent HIV testing (Olsen, Christensen, Murray & Ekbom, 2010). It shows how many times as likely the employed were to undergo HIV testing as the unemployed.
Relative proportion = proportion of employed tested / proportion of unemployed tested
= 0.5254 / 0.3396
= 1.5471
From the calculation above, the employed are 1.5471 times as likely to have undergone HIV testing as the unemployed.
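The attributable proportion is a measure of the health impact of a causative factor (Dicker, Coronado & Koo, 2006). Here it shows what percentage of HIV testing among the employed is associated with employment status.
Attributable proportion = (proportion of employed tested - proportion of unemployed tested) / proportion of employed tested
= (0.5254 - 0.3396) / 0.5254
= 0.3536
= 0.3536 x 100
= 35.36%
From the above calculation, about 35.36% of HIV testing among the employed is attributable to employment status; the remaining 64.64% would be expected even without employment. Together with the relative proportion of 1.5471, this suggests that undergoing HIV testing is associated with employment status in this sample, although these descriptive figures alone do not show whether the difference could be due to chance.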
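The relative and attributable proportions can be recomputed from the two group proportions. This is a minimal sketch, assuming the counts implied by the reported proportions (31 of 59 employed and 18 of 53 unemployed men tested); working from unrounded proportions gives a relative proportion of about 1.55 and an attributable proportion of about 0.35.

```python
# Counts inferred from the reported proportions (assumption, not quoted data).
p_employed = 31 / 59
p_unemployed = 18 / 53

# How many times as likely the employed were to be tested as the unemployed.
relative_proportion = p_employed / p_unemployed

# Share of testing among the employed that is attributable to employment.
attributable_proportion = (p_employed - p_unemployed) / p_employed

print(f"relative proportion:     {relative_proportion:.4f}")
print(f"attributable proportion: {attributable_proportion:.4f}")
```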
A hypothesis is a statement about a characteristic of a population. It concerns the parameters of the population about which the statement is made (Rosner, 2010). Hypothesis testing helps the researcher to reach a conclusion about the population by examining a sample drawn from it (Rosner, 2010). A null hypothesis is a statement set up to be tested. An alternative hypothesis is the statement accepted if testing leads to rejection of the null hypothesis.
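Null hypothesis: Undergoing HIV testing is not associated with employment status
H0: p1 = p2
H1: p1 ≠ p2
where p1 = 0.5254 is the proportion tested among the employed (n1 = 59) and p2 = 0.3396 is the proportion tested among the unemployed (n2 = 53).
Pooled proportion p = (31/112) + (18/112)
= 0.2768 + 0.1607
= 0.4375
q = 1 - p = 0.5625
Z = (p1 - p2) / √(pq(1/n1 + 1/n2))
= (0.5254 - 0.3396) / √(0.4375 x 0.5625 x (1/59 + 1/53))
= 1.979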
From the z tables, the two-sided P-value corresponding to Z = 1.979 is 0.0478. This value is less than 0.05 and therefore we reject the null hypothesis. Therefore, there is an association between undergoing HIV testing and employment status.
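The z statistic and its P-value can be verified without tables. The sketch below is a pooled two-proportion z-test using only the standard library; the counts (31 of 59 employed, 18 of 53 unemployed) are assumptions inferred from the reported proportions, and the function name is illustrative.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided P-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # For a standard normal variable, 2 * P(Z > |z|) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Employed: 31 tested of 59; unemployed: 18 tested of 53 (inferred counts).
z, p_value = two_proportion_z_test(31, 59, 18, 53)
print(f"Z = {z:.3f}, two-sided P = {p_value:.4f}")
```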
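A confidence interval is a pair of numerical values defining an interval that includes a certain parameter with a specified degree of confidence.
95% C.I = (p1 - p2) ± 1.96 x √(p1q1/n1 + p2q2/n2)
= 0.1858 ± 1.96 x √(0.5254 x 0.4746/59 + 0.3396 x 0.6604/53)
= 0.1858 ± 1.96 x 0.0920
= 0.1858 ± 0.1803
= 0.0055, 0.3661
Therefore, we are 95% confident that the true population difference between the proportion of employed men undergoing HIV testing and the proportion of unemployed men undergoing HIV testing lies between 0.0055 and 0.3661. The interval excludes zero, but only just, which agrees with the P-value being slightly below 0.05.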
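A 95% confidence interval for the difference in proportions can be checked with a few lines of code. This is a sketch under stated assumptions: it uses the Wald interval with the unpooled standard error, taking the square root of the summed variances, and the counts 31/59 and 18/53 inferred from the reported proportions.

```python
import math


def diff_proportion_ci(x1, n1, x2, n2, z_crit=1.96):
    """Wald confidence interval for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    # The standard error is the square root of the summed variances.
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_crit * se, diff + z_crit * se


# Employed: 31 tested of 59; unemployed: 18 tested of 53 (inferred counts).
lower, upper = diff_proportion_ci(31, 59, 18, 53)
print(f"95% CI for p1 - p2: ({lower:.4f}, {upper:.4f})")
```

Under these assumptions the interval runs from just above zero to about 0.37, so it is far wider than a margin of ±0.017 would suggest.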
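2.b. The odds ratio for the association between HIV testing and being employed is 2.15. This implies that the odds of having undergone HIV testing in the last 12 months are 2.15 times as high among employed men as among unemployed men. The 95% CI (1.01, 4.62) excludes 1, so the association is statistically significant at the 5% level. However, the interval is wide and its lower limit is very close to 1, so the size of the effect is estimated imprecisely.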
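The quoted odds ratio can be approximately reproduced from a 2 x 2 table. This is a sketch only: the cell counts are assumptions inferred from the reported proportions (31 tested and 28 untested among the employed, 18 tested and 35 untested among the unemployed), and the interval uses the Woolf (log) method, which may not be the method behind the quoted interval. With these inputs the lower limit comes out near 1.00 rather than 1.01, which may reflect a different interval method or rounding.

```python
import math

# Inferred 2x2 counts (assumption, not quoted from the survey).
a, b = 31, 28   # employed: tested, not tested
c, d = 18, 35   # unemployed: tested, not tested

odds_ratio = (a * d) / (b * c)

# Woolf method: the standard error is computed on the log-odds scale.
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
log_or = math.log(odds_ratio)
ci_lower = math.exp(log_or - 1.96 * se_log_or)
ci_upper = math.exp(log_or + 1.96 * se_log_or)

print(f"OR = {odds_ratio:.2f}, 95% CI ({ci_lower:.2f}, {ci_upper:.2f})")
```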
2.c. The researcher should consider increasing the sample size. With only 112 men the estimates are imprecise, and a larger sample would give narrower confidence intervals and a more reliable assessment of the association between undergoing HIV testing and employment status. A larger, representative sample would also make it safer to generalize the conclusion to the target population. The researcher should also adjust for the other factors recorded in the survey, such as age and place of residence, because these may confound the crude association. The P value obtained from the hypothesis test was 0.0478, which is only just below 0.05. This is weak evidence of an association, and the result may have come about by chance.
3. Significance testing
- Methods that can be used to carry out the significance test on the PASI score include the paired t-test and the Wilcoxon signed-rank test. The PASI score is numerical as opposed to categorical (Parikh & Hazra, 2010). The data are paired, because each patient received calcipotriol on one side of the body and betamethasone on the other, so every patient contributes one PASI reduction per treatment. For paired numerical data the paired t-test or repeated-measures ANOVA is appropriate (Wang, Clayton & Bakhai, 2006); with only two treatments these are equivalent. The Wilcoxon signed-rank test is the non-parametric alternative if the within-patient differences are not approximately normally distributed. The unpaired t-test and one-way ANOVA are not appropriate because they assume independent groups. The question concerns a difference between treatments rather than a correlation, so Pearson’s and Spearman’s coefficients are not used (Petrie & Sabin, 2005).
- The method that can be used to carry out the significance test on the recoded self-assessment is McNemar’s test. The recoded data are categorical as opposed to numerical (Barun & Hazra, 2011), and they are paired because each patient rated both sides of the body. McNemar’s test is the appropriate test for paired binary data (Wang, Clayton & Bakhai, 2006), whereas the ordinary chi-square test assumes independent groups. Once an association is established, an odds ratio may be used to describe its size (Petrie & Sabin, 2005).
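In a right/left design every patient contributes one measurement per side, so the two measurements form a pair and paired methods apply. The sketch below shows the two test statistics in that setting: a paired t statistic for a numerical score and McNemar's statistic for a recoded yes/no outcome. All numbers are hypothetical values for illustration only, not data from the trial, and the function names are illustrative.

```python
import math
import statistics


def paired_t_statistic(side_a, side_b):
    """Paired t statistic and degrees of freedom from within-patient differences."""
    diffs = [a - b for a, b in zip(side_a, side_b)]
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1


def mcnemar_statistic(only_a_improved, only_b_improved):
    """McNemar chi-square statistic (1 df, no continuity correction).

    Only discordant pairs count: patients who improved on one side
    but not on the other.
    """
    b, c = only_a_improved, only_b_improved
    return (b - c) ** 2 / (b + c)


# Hypothetical percentage reductions in PASI for five patients.
calcipotriol_reduction = [70.0, 65.0, 72.0, 60.0, 68.0]
betamethasone_reduction = [62.0, 63.0, 66.0, 58.0, 61.0]
t_stat, dof = paired_t_statistic(calcipotriol_reduction, betamethasone_reduction)

# Hypothetical discordant pairs for the recoded self-assessment.
chi2_mcnemar = mcnemar_statistic(only_a_improved=40, only_b_improved=20)

print(f"paired t = {t_stat:.3f} on {dof} df")
print(f"McNemar X^2 = {chi2_mcnemar:.3f} on 1 df")
```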
Conclusion
Measures of association quantify the relationship between two or more variables. They show whether an association exists and, if so, how strong it is. The chi-square test shows whether an association is statistically significant but not how strong it is. A proportion is a ratio in which the numerator is a subset of the denominator. The two methods by which conclusions can be drawn about a population from a sample are confidence intervals and hypothesis testing. Deciding which method of significance testing to use involves several steps. These include deciding whether the data are numerical or categorical, whether they are paired or unpaired, and whether or not there is an association between the variables.
References
Rosner, B. (2010). Fundamentals of Biostatistics. Boston: Brooks/Cole.
Cohen, J., & Nee, J. C. (2010). Estimators for two measures of association for set correlation. Educational and Psychological Measurement, 44(4), 907-917.
Dicker, R., Coronado, F., & Koo, D. (2006). Principles of Epidemiology in Public Health Practice: An introduction to applied epidemiology and biostatistics. Waldorf: PHF.
Olsen, J., Christensen, K., Murray, J., & Ekbom, A. (2010). An Introduction to Epidemiology for Health Professionals. New York: Springer
Parikh, M. N., Hazra, A., Mukherjee, J., & Gogtay, N. (2010). Research methodology simplified: Hypothesis testing and choice of statistical test. New Delhi: Jaypee Brothers.
Petrie, A., Sabin, C. (2005). Medical statistics at a glance. The theory of linear regression and performing a linear regression analysis. London: Blackwell Publishers.
Wang, D., Clayton, T., & Bakhai, A. (2006). A practical guide to design, analysis, and reporting. London: Remedica.
Daniel, W. W. (2009). Biostatistics: A foundation for analysis in the health sciences. Danvers: Wiley.