2. Note: Each student will get different answers as the data sets differ.
a. Using the assignment data file allocated to you and R Commander, tabulate the relationship between gender (sex) and most frequently used mode of transport in the past month (transport). Please use the results from R Commander to create a table in Word with appropriate headings (the output from R Commander is poorly labelled and will not be accepted).
b. Using row or column percentages describe the relationship between gender and most frequently used mode of transport in the past month.
3. Note: Each student will get different answers as the data sets differ.
a. Using the assignment data file allocated to you and R Commander, graph the relationship between the number of activities attended in the past month (activities) and drivers licence status in this sample of 17-year-old Australians. This figure should be prepared in R Commander with appropriate axis labels, then copied and pasted into your assignment answers with an appropriate title.
b. Use appropriate statistics to describe the centre, spread and shape of the distribution of number of activities attended per month for each category of drivers licence status separately. You must clearly indicate which statistics describe the centre, which describe the spread and which describe the shape. Copying R Commander output is insufficient and should be avoided.
c. Using the results in parts a. and b. above, describe the relationship between number of activities attended in the past month and drivers licence status.
4. Note: Each student will get different answers as the data sets differ.
a. Using the assignment data file allocated to you and R Commander, draw an appropriate graph of the relationship between self-reported sedentary hours per week and number of activities attended in the past month. When preparing the graph in R Commander don’t forget to provide meaningful labels on the axes.
b. Using the graph in part a., describe the form, direction and strength of the relationship between self-reported sedentary hours per week and number of activities attended in the past month.
5. A group of 8 students were asked about their age, gender and area of study. The responses (sorted on age) are shown in the following table:
| Initials | Age in years | Gender | Area of study |
|---|---|---|---|
| HP | 17 | Male | Nursing |
| RT | 19 | Male | Accounting |
| SK | 20 | Female | Psychology |
| KZ | 20 | Male | Psychology |
| AN | 21 | Female | Nursing |
| KK | 22 | Female | Psychology |
| JH | 22 | Male | Psychology |
| PV | 25 | Female | Nursing |
a. If you select one person at random from this group, what is the probability this person will be 18 or more years of age?
b. If you selected one person at random from this group, what is the probability they will be a female who is studying psychology?
c. If you selected one female at random from this group, what is the probability they will be 21 or more years of age and studying nursing?
6. In Australia, the probability of having blood type B is 0.1.
a. Blood type was recorded for a random sample of 250 Australian adults. Using R Commander, what is the probability that this random sample of 250 adults will contain 25 or fewer people whose blood group is B?
b. Suppose 200 random samples were drawn and each of these 200 samples contained exactly 250 people. We would predict 12% of all samples to contain fewer than how many people with type B blood?
c. Estimate the mean number of people with type B blood per sample. Show any working.
7. The hours of sleep per night for 17-year-olds is known to be Normally distributed with a mean of 8.2 hours and a standard deviation of 0.6 hours. Use this information to answer the following questions. Show any working.
a. In 17-year-olds, how many hours of sleep correspond to a z-score of 1?
b. Choose one 17-year-old at random from this population. Using R Commander, estimate the probability that this person sleeps between 7.5 and 8.0 hours per night.
c. Choose a random sample of sixteen 17-year-olds. Using R Commander, estimate the probability that the sample mean for normal nights will lie between 7.5 and 8.0 hours per night. Show any working.
d. Choose a random sample of sixteen 17-year-olds. How many of this group would you expect to sleep between 7.5 and 8.0 hours per night?
Answer
Biostatistics
2. Part a
Most frequently used mode of transport in the past month, by gender (counts)
| Gender | Driver | Passenger | Other |
|---|---|---|---|
| Female | 28 | 61 | 39 |
| Male | 52 | 62 | 29 |
Part b
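Most frequently used mode of transport in the past month, by gender (counts and row percentages)
| Gender | Driver | Passenger | Other | Total |
|---|---|---|---|---|
| Female (count) | 28 | 61 | 39 | 128 |
| Female (row %) | 21.9% | 47.7% | 30.5% | 100% |
| Male (count) | 52 | 62 | 29 | 143 |
| Male (row %) | 36.4% | 43.4% | 20.3% | 100% |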
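Row percentages are used to describe the relationship between gender and mode of transport. Passenger was the most frequently used mode of transport for both females (47.7%) and males (43.4%). Males were more likely than females to be drivers (36.4% versus 21.9%), whereas females were more likely than males to use other modes of transport (30.5% versus 20.3%).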
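The row percentages can be checked directly from the counts. The following is a minimal Python sketch rather than the R Commander procedure; the dictionary layout and the helper name `row_percentages` are illustrative assumptions, and the counts are those reported in Part a.

```python
# Counts taken from the Part a table (gender by mode of transport).
counts = {
    "Female": {"Driver": 28, "Passenger": 61, "Other": 39},
    "Male": {"Driver": 52, "Passenger": 62, "Other": 29},
}


def row_percentages(table):
    """Return each cell as a percentage of its row total, rounded to 1 d.p."""
    result = {}
    for row, cells in table.items():
        total = sum(cells.values())
        result[row] = {col: round(100 * n / total, 1) for col, n in cells.items()}
    return result


percentages = row_percentages(counts)
for gender, cells in percentages.items():
    print(gender, cells)
```

Row percentages are the natural choice here because each gender has a different total (128 and 143), so raw counts are not directly comparable.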
3. Part a
Part b
| Statistics | Not-licenced | Learners permit | Licenced |
|---|---|---|---|
| Mean | 6.550725 | 6.293333 | 8.370079 |
| Standard Deviation | 2.179694 | 2.252766 | 2.107444 |
| Minimum | 3 | 2 | 3 |
| 1st Quartile | 5 | 4 | 7 |
| Median | 6 | 6 | 9 |
| 3rd Quartile | 8 | 8 | 10 |
| Maximum | 11 | 12 | 13 |
| Range | 8 | 10 | 10 |
| IQR | 3 | 4 | 3 |
| Skewness | 0.135572 | 0.3590054 | -0.2056677 |
The above table presents the descriptive statistics for the number of activities attended, by driver’s licence status. The mean and median describe the centre of the distribution. The standard deviation, range and IQR describe the spread. Skewness describes the shape.
Part c
From the graph in Part a and the table in Part b, the mean number of activities attended by licenced drivers (8.37) is higher than that of not-licenced respondents (6.55) and learners permit holders (6.29). The median is also higher for licenced drivers (9) than for the other two groups (6 each). The range for licenced drivers and learners permit holders (10) is wider than for not-licenced respondents (8). Moreover, the distribution for licenced drivers is slightly left skewed (skewness -0.21), whereas the distributions for not-licenced respondents and learners permit holders are slightly right skewed (skewness 0.14 and 0.36).
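The direction of skew follows from the sign of the skewness coefficient: positive values indicate a longer right tail and negative values a longer left tail. The sketch below, with an illustrative `describe` helper that is not part of R Commander, re-derives the range and IQR from the Part b summary values and labels the skew direction for each group.

```python
# Summary values copied from the Part b table, keyed by licence status.
summary = {
    "Not-licenced": {"min": 3, "q1": 5, "q3": 8, "max": 11, "skew": 0.135572},
    "Learners permit": {"min": 2, "q1": 4, "q3": 8, "max": 12, "skew": 0.3590054},
    "Licenced": {"min": 3, "q1": 7, "q3": 10, "max": 13, "skew": -0.2056677},
}


def describe(stats):
    """Derive range, IQR and skew direction from the summary values."""
    if stats["skew"] > 0:
        direction = "right (positive) skew"
    elif stats["skew"] < 0:
        direction = "left (negative) skew"
    else:
        direction = "symmetric"
    return {
        "range": stats["max"] - stats["min"],
        "iqr": stats["q3"] - stats["q1"],
        "shape": direction,
    }


described = {group: describe(stats) for group, stats in summary.items()}
for group, result in described.items():
    print(group, result)
```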
4. Part a
Part b
From the above plot, the number of activities attended increases as self-reported sedentary hours per week (sed) decrease, so the direction of the relationship is negative.
5. Part a
From the table, 7 of the 8 people are 18 or more years of age.
Thus, if one person is selected at random, the probability that this person is 18 or more years of age = 7/8 = 0.875
Part b
The group, by gender and area of study:
| Gender | Accounting | Nursing | Psychology | Total |
|---|---|---|---|---|
| Female | 0 | 2 | 2 | 4 |
| Male | 1 | 1 | 2 | 4 |
| Total | 1 | 3 | 4 | 8 |
From the above table it is seen that the total number of people = 8
Number of females studying psychology = 2
Thus, the probability that a person chosen randomly would be female and studying psychology =2/8 = 0.25
Part c
Number of Students 21 years or more = 4
For students 21 years and more
| Gender | Accounting | Nursing | Psychology | Total |
|---|---|---|---|---|
| Female | 0 | 2 | 1 | 3 |
| Male | 0 | 0 | 1 | 1 |
| Total | 0 | 2 | 2 | 4 |
From the above table, the number of females who are 21 years or more and studying Nursing = 2
The person is selected from all females in the group, and the total number of females = 4
Thus, the probability that a female chosen randomly would be 21 years or more and studying Nursing = 2/4 = 0.5
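The three probabilities in this question can be checked by counting directly over the eight students. This is a sketch only: the tuple layout and the `probability` helper are illustrative assumptions, and Part c is read as sampling from the four females in the group, as the question wording states.

```python
from fractions import Fraction

# The eight students from the question: (initials, age, gender, area of study).
students = [
    ("HP", 17, "Male", "Nursing"),
    ("RT", 19, "Male", "Accounting"),
    ("SK", 20, "Female", "Psychology"),
    ("KZ", 20, "Male", "Psychology"),
    ("AN", 21, "Female", "Nursing"),
    ("KK", 22, "Female", "Psychology"),
    ("JH", 22, "Male", "Psychology"),
    ("PV", 25, "Female", "Nursing"),
]


def probability(event, population):
    """Favourable outcomes divided by equally likely outcomes."""
    favourable = [s for s in population if event(s)]
    return Fraction(len(favourable), len(population))


# Part a: 18 or more years of age, sampling from the whole group.
p_a = probability(lambda s: s[1] >= 18, students)
# Part b: female and studying psychology, sampling from the whole group.
p_b = probability(lambda s: s[2] == "Female" and s[3] == "Psychology", students)
# Part c: the sample space is restricted to females only.
females = [s for s in students if s[2] == "Female"]
p_c = probability(lambda s: s[1] >= 21 and s[3] == "Nursing", females)

print(p_a, p_b, p_c)
```

The key design point is the choice of denominator: Parts a and b divide by all 8 students, while Part c divides by the 4 females because the selection is made from females only.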
6. Part a
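Let X be the number of people with blood group B in a random sample of 250 adults, so that X follows a binomial distribution with n = 250 and p = 0.1.
The probability required is $P(X \le 25)$.
The expected count is 250 × 0.1 = 25, so under a normal approximation the probability that a random sample of 250 adults contains 25 or fewer people with blood group B ≈ 0.5 (the exact binomial probability is slightly higher, about 0.55).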
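For comparison, the exact binomial tail probability can be summed term by term. The sketch below uses only the Python standard library and is not the R Commander output; the function name `binomial_cdf` is an illustrative assumption. Because the exact calculation includes the full probability mass at X = 25, it gives a value a little above 0.5.

```python
from math import comb


def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


n, p = 250, 0.1
p_at_most_25 = binomial_cdf(25, n, p)
print(round(p_at_most_25, 4))
```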
Part b
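Let x be the number of people with type B blood in a sample.
Each sample contains 250 people.
Thus, the sample proportion of people with type B blood is $\hat{p} = x/250$.
The population proportion of people with blood type B = 0.1, so x has mean 250 × 0.1 = 25 and standard deviation $\sqrt{250 \times 0.1 \times 0.9} \approx 4.74$.
Using the normal approximation, the 12th percentile of x is approximately 25 - 1.175 × 4.74 ≈ 19.4.
Thus, 12% of samples would be expected to contain fewer than about 20 people with blood type B.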
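One way to check the cutoff is to take the 12th percentile of the Normal distribution that approximates the binomial count. This is a sketch under that approximation, using Python's `statistics.NormalDist` rather than R Commander; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, p = 250, 0.1
mean = n * p                    # expected count per sample
sd = sqrt(n * p * (1 - p))      # binomial standard deviation

# 12th percentile of the approximating Normal distribution.
cutoff = NormalDist(mean, sd).inv_cdf(0.12)
print(round(cutoff, 1))
```

Because the 12th percentile lies below the median, the cutoff must fall below the mean count of 25.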
Part c
The mean number of people with blood type B per sample = n × p = 250 × 0.1 = 25
7. Part a
z-score = 1
mean number of hours of sleep = 8.2
standard deviation of hours of sleep = 0.6
$x = \mu + z\sigma = 8.2 + 1 \times 0.6 = 8.8$
Thus, 8.8 hours of sleep corresponds to z-score of 1
Part b
The probability that the person sleeps between 7.5 and 8.0 hours per night = 0.2477688
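Parts a and b can be reproduced with Python's `statistics.NormalDist`, offered here as a cross-check rather than as the R Commander steps. The variable names are illustrative; the mean and standard deviation are those given in the question.

```python
from statistics import NormalDist

# Hours of sleep per night for one 17-year-old: Normal(8.2, 0.6).
sleep = NormalDist(mu=8.2, sigma=0.6)

# Part a: the value one standard deviation above the mean (z = 1).
hours_at_z1 = sleep.mean + 1 * sleep.stdev

# Part b: P(7.5 < X < 8.0) as a difference of cumulative probabilities.
p_between = sleep.cdf(8.0) - sleep.cdf(7.5)

print(round(hours_at_z1, 1), round(p_between, 4))
```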
Part c
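The probability required is $P(7.5 < \bar{x} < 8.0)$ when the sample size is 16.
The standard error of the sample mean is $\sigma/\sqrt{n} = 0.6/\sqrt{16} = 0.15$.
The sample mean is therefore Normally distributed with mean 8.2 hours and standard deviation 0.15 hours.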
Thus the probability that the sample mean would lie between 7.5 and 8.0 hours = 0.0912
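The same calculation can be sketched in Python by replacing the population standard deviation with the standard error. This is a cross-check under the stated Normal model, not the R Commander output, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 8.2, 0.6, 16

# The standard error shrinks the spread by a factor of sqrt(n).
standard_error = sigma / sqrt(n)
sample_mean = NormalDist(mu, standard_error)

# P(7.5 < sample mean < 8.0).
p_mean_between = sample_mean.cdf(8.0) - sample_mean.cdf(7.5)
print(round(standard_error, 2), round(p_mean_between, 4))
```

The probability is much smaller than for a single person because the sample mean is concentrated more tightly around 8.2 hours.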
Part d
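The probability that an individual 17-year-old sleeps between 7.5 and 8.0 hours per night = 0.2477688 (from Part b)
The sample size = 16
Thus, the expected number in the group = 16 × 0.2477688 = 3.96 ≈ 4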
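Part d asks about individuals in the group rather than the sample mean, so the expected count is the sample size multiplied by the probability for a single person. A short sketch under that reading, with illustrative variable names:

```python
from statistics import NormalDist

# Distribution of sleep for one person, as given in the question.
sleep = NormalDist(mu=8.2, sigma=0.6)
p_individual = sleep.cdf(8.0) - sleep.cdf(7.5)

# Expected number of the 16 people who fall in the interval.
expected_count = 16 * p_individual
print(round(expected_count, 2))
```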