This question relates to stratified sampling
You should be familiar with the following dataset from Assignment 2.
Copy the following three lines into R. Make sure you are connected to the internet when you run them, since the package may need to be installed. They load the “crabs” dataset into your working environment.
if (!require("glmbb")){install.packages("glmbb")}
library(glmbb)
data(crabs)
A brief summary of the dataset is available at https://rdrr.io/cran/glmbb/man/crabs.html
Treat this dataset as our population. For this question, use “satell” as the response variable (Y), “weight” as the auxiliary variable (X), and “color” as the variable that defines the strata.
You will find four different colors: dark, darker, light and medium. Treat these as four different strata for your analysis.
We are still interested in the average of the variable “satell”, which is the number of satellites around a female crab.
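Because the dataset is treated as the population, the stratum sizes N_h and the stratum means of Y and X are known quantities, and the population mean of “satell” equals the N_h-weighted average of the stratum means. The sketch below tabulates these values. The function name `stratum_summary` and its argument names are hypothetical choices for illustration, not part of the assignment.

```r
# Sketch: per-stratum population summaries for a data frame `pop`.
# `stratum_summary` is a hypothetical helper name, not from the assignment.
stratum_summary <- function(pop, y = "satell", x = "weight", strata = "color") {
  # as.character() avoids empty groups from unused factor levels
  groups <- split(pop, as.character(pop[[strata]]))
  data.frame(
    stratum = names(groups),
    N_h     = vapply(groups, nrow, integer(1)),                      # stratum size
    y_mean  = vapply(groups, function(g) mean(g[[y]]), numeric(1)),  # stratum mean of Y
    x_mean  = vapply(groups, function(g) mean(g[[x]]), numeric(1)),  # stratum mean of X
    row.names = NULL
  )
}

# Usage on the crabs population, run only if glmbb is installed
if (requireNamespace("glmbb", quietly = TRUE)) {
  data(crabs, package = "glmbb")
  s <- stratum_summary(crabs)
  print(s)
  # The weighted stratum means reproduce the overall population mean
  print(c(weighted = sum(s$N_h * s$y_mean) / sum(s$N_h),
          overall  = mean(crabs$satell)))
}
```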
a. Write a function in R that
b. Replicate this function 1000 (or more) times and compare the performance of the two estimators: “y_bar_st” and “y_bar_st_ratio”.
c. What would you recommend based on your findings?
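The steps in parts a to c can be sketched as follows. The sampling design is not fully specified here, so this sketch makes several assumptions: simple random sampling without replacement within each stratum, proportional allocation, a total sample size of n = 40, and a combined ratio estimator for “y_bar_st_ratio”. The function names `one_stratified_sample` and `compare_estimators` are hypothetical. Adjust all of these to match the design you are asked to use.

```r
# One replicate under an ASSUMED design: stratified SRS without replacement
# with proportional allocation. Returns both estimators of the mean of Y.
one_stratified_sample <- function(pop, n = 40, y = "satell", x = "weight",
                                  strata = "color") {
  groups <- split(pop, as.character(pop[[strata]]))
  N_h <- vapply(groups, nrow, integer(1))
  W_h <- N_h / sum(N_h)                        # stratum weights
  n_h <- pmin(N_h, pmax(2, round(n * W_h)))    # at least 2 per stratum, at most N_h
  samp <- Map(function(g, k) g[sample.int(nrow(g), k), , drop = FALSE],
              groups, n_h)
  ybar_h <- vapply(samp, function(s) mean(s[[y]]), numeric(1))
  xbar_h <- vapply(samp, function(s) mean(s[[x]]), numeric(1))
  y_bar_st <- sum(W_h * ybar_h)
  x_bar_st <- sum(W_h * xbar_h)
  # Combined ratio estimator (an assumption): scale by the known population mean of X
  c(y_bar_st = y_bar_st,
    y_bar_st_ratio = y_bar_st / x_bar_st * mean(pop[[x]]))
}

# Repeat the sampling and summarise bias, variance and MSE of each estimator
compare_estimators <- function(pop, reps = 1000, n = 40, y = "satell",
                               x = "weight", strata = "color") {
  est <- replicate(reps, one_stratified_sample(pop, n, y, x, strata))  # 2 x reps
  truth <- mean(pop[[y]])
  data.frame(
    estimator = rownames(est),
    bias      = rowMeans(est) - truth,
    variance  = apply(est, 1, var),
    mse       = rowMeans((est - truth)^2),
    row.names = NULL
  )
}

# Usage on the crabs population, run only if glmbb is installed
if (requireNamespace("glmbb", quietly = TRUE)) {
  data(crabs, package = "glmbb")
  set.seed(2024)  # arbitrary seed, for reproducibility only
  print(compare_estimators(crabs, reps = 1000, n = 40))
}
```

Comparing the two rows of the output by MSE gives one possible basis for the recommendation in part c, since MSE combines the bias of the ratio estimator with its variance.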
This question relates to cluster and multistage sampling
We will continue to use the “crabs” dataset from question 1 as our population.
This question relates to single factor experiment
Suppose that an experiment studies the effect of the amount of baking powder in a biscuit dough on the rise heights of the biscuits. Four levels of baking powder were tested, and four replicate biscuits were made at each level in random order. The data are given below; each column is one level of baking powder and each row is one replicate.
0.25 tsp | 0.5 tsp | 0.75 tsp | 1 tsp
11.4 | 27.8 | 47.6 | 61.6
11.0 | 29.2 | 47.0 | 62.4
11.3 | 26.8 | 47.3 | 63.0
9.5 | 26.0 | 45.5 | 63.9
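As a starting point, the data can be entered in R and analysed as a single-factor design. This is a minimal sketch, assuming that the four values listed under each level are the four replicates for that level and that a one-way ANOVA with baking powder as a categorical factor is the intended analysis. The object names `biscuits`, `fit` and `group_means` are illustrative.

```r
# Rise heights entered replicate by replicate; each replicate holds the four
# levels in the order 0.25, 0.5, 0.75, 1 tsp
biscuits <- data.frame(
  powder = factor(rep(c("0.25", "0.5", "0.75", "1"), times = 4)),
  height = c(11.4, 27.8, 47.6, 61.6,
             11.0, 29.2, 47.0, 62.4,
             11.3, 26.8, 47.3, 63.0,
              9.5, 26.0, 45.5, 63.9)
)

# Mean rise height at each level of baking powder
group_means <- tapply(biscuits$height, biscuits$powder, mean)
print(group_means)

# One-way ANOVA: 3 degrees of freedom for treatment, 12 for error
fit <- aov(height ~ powder, data = biscuits)
print(summary(fit))
```

Treating the amount as a factor makes no assumption about the shape of the response. Because the levels are equally spaced quantities, a regression of height on the numeric amount would be an alternative worth considering.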