As part of a study, researchers examined the relationship between claw size (propodus height, in mm) and claw closing force (in Newtons) for three intertidal crab species: Hemigrapsus nudus (HN), Lophopanopeus bellus (LB), and Cancer productus (CP). The raw data can be seen in Display 7.15 (p. 201). The following code will load the data, split it into four subsets (ALL: all three species; HN: Hemigrapsus nudus only; LB: Lophopanopeus bellus only; CP: Cancer productus only), fit a linear regression model to each subset, and produce diagnostic plots.
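The course code itself is not reproduced in this text, so what follows is only a minimal sketch of the steps just described, not the original script. It assumes a data frame with columns named `Force`, `Height`, and `Species` (holding the full species names); those names, and the helper functions `fit_crab_models` and `show_diagnostics`, are illustrative assumptions.

```r
# Sketch only. Assumes `crabs` has columns Force (N), Height (mm), and
# Species (full species names). Column names are assumptions, not confirmed.
fit_crab_models <- function(crabs) {
  # Four subsets: all species together, then one per species.
  subsets <- list(
    ALL = crabs,
    HN  = subset(crabs, Species == "Hemigrapsus nudus"),
    LB  = subset(crabs, Species == "Lophopanopeus bellus"),
    CP  = subset(crabs, Species == "Cancer productus")
  )
  # One simple linear regression of force on height per subset.
  models <- lapply(subsets, function(d) lm(Force ~ Height, data = d))
  list(subsets = subsets, models = models)
}

# Draw the four standard lm diagnostic plots for one fitted model.
show_diagnostics <- function(model, label) {
  old <- par(mfrow = c(2, 2))   # 2 x 2 grid of panels
  on.exit(par(old))             # restore the previous layout afterwards
  plot(model)                   # Residuals vs Fitted, Normal Q-Q, and two more
  mtext(label, outer = TRUE, line = -1.5)
}
```

With the data loaded as `crabs` (the data set may be available as `ex0722` in the Sleuth3 package, which should be verified before relying on it), `fits <- fit_crab_models(crabs)` followed by `ALL.lm <- fits$models$ALL` and `show_diagnostics(ALL.lm, "ALL")` would give a model and plots of the kind the questions refer to.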
(1) Use the diagnostic plots for ALL species (the first set of diagnostic plots) to assess whether the linear model (ALL.lm) seems to fit the data well. Show the plots and discuss the fit in terms of the Residuals vs Fitted plot (i.e., outliers and non-constant variance) and the Normal Q-Q plot (i.e., normality and model fit).
(2) Run the following code to plot the data points for all species and add the regression line. What are your impressions about the fit of the line to the data?
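The plotting code is not included in this text. One way such a plot could be drawn is sketched below; the helper `plot_with_fit` and the column names `Height` and `Force` are assumptions made for illustration.

```r
# Scatterplot of force against height with the fitted least-squares line.
# Assumes `data` has columns Height and Force, and that `model` was fit
# as lm(Force ~ Height, data = data).
plot_with_fit <- function(data, model, main = "") {
  plot(data$Height, data$Force,
       xlab = "Propodus height (mm)", ylab = "Closing force (N)",
       main = main)
  abline(model)             # draws intercept + slope * Height
  invisible(coef(model))    # return the coefficients without printing them
}
```

For example, `plot_with_fit(ALL, ALL.lm, "All species")` would apply it to the full data set, and the same helper could be reused for a single-species subset.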
(3) Run summary(ALL.lm) to get the summary for the linear model using all species. What does this summary say about whether the slope and the intercept in this model are equal to zero? Show the R output and explain.
(4) What is the equation for the best fitting line when using all species?
(5) If a crab (unknown species) had a propodus height of 10.5 mm, what would you estimate the closing force to be? Show your work.
(6) Suppose the researchers found a particularly large crab with a propodus height of 16 mm. Can this dataset be used to estimate the closing force? If yes, what would that force be? If no, why not?
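For questions 5 and 6, the estimate comes from plugging the height into the fitted line, estimated force = β0 + β1 × height, and the estimate is only trustworthy if the height lies within the range of heights actually observed. The sketch below shows one way to check both with `predict()`; the helper `predict_force` and the column name `Height` are assumptions, and the hand calculation should still be shown for "show your work".

```r
# Predict closing force at a given height and flag extrapolation.
# Assumes the model was fit as lm(Force ~ Height, data = ...).
predict_force <- function(model, height) {
  observed <- range(model$model$Height)   # heights used to fit the model
  estimate <- unname(predict(model, newdata = data.frame(Height = height)))
  list(
    estimate = estimate,                  # equals beta0_hat + beta1_hat * height
    within_range = height >= observed[1] & height <= observed[2],
    observed_range = observed
  )
}
```

Calling `predict_force(ALL.lm, 10.5)` and `predict_force(ALL.lm, 16)` would return the point estimate together with a flag showing whether that height falls inside the observed range.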
(7) What is the correlation between propodus height and closing force when using data for all species? Explain how you arrived at your answer.
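In simple linear regression the correlation can be obtained either directly with `cor()` or from the model summary, since r equals the square root of the Multiple R-squared, carrying the sign of the slope. The following sketch computes it both ways so the two can be compared; the helper name `correlation_two_ways` and the column names are assumptions.

```r
# Correlation between height and force, computed two equivalent ways.
# Assumes the model was fit as lm(Force ~ Height, data = ...).
correlation_two_ways <- function(model) {
  mf <- model$model                          # data actually used in the fit
  r_direct <- cor(mf$Height, mf$Force)       # Pearson correlation
  slope <- coef(model)[["Height"]]
  r_from_r2 <- sign(slope) * sqrt(summary(model)$r.squared)
  c(direct = r_direct, from_r_squared = r_from_r2)
}
```

`correlation_two_ways(ALL.lm)` should return two matching values.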
Questions 8–14 use the data subsets. Read carefully to make sure you are using the correct subset.
(8) Run the following code to get the linear model summary for each species. Report the intercept (β0) and the slope (β1) for each species in a table.
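Besides reading the values off each `summary()` output, the coefficients can be collected into one table programmatically. This is a sketch under the assumption that each model is a simple regression with one intercept and one slope; `coef_table` is a hypothetical helper.

```r
# Collect intercepts and slopes from several simple regressions into a table.
# `models` is a named list of lm fits, each with one intercept and one slope.
coef_table <- function(models) {
  tab <- t(sapply(models, coef))   # one row per model
  colnames(tab) <- c("beta0 (intercept)", "beta1 (slope)")
  round(tab, 3)
}
```

For example, `coef_table(list(HN = HN.lm, LB = LB.lm, CP = CP.lm))` gives one row per species.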
(9) What do the values of the intercepts (β0) and slopes (β1) in question 8 say about the feasibility of using a single model (ALL.lm) to model the data? Explain your reasoning.
(10) What is the Multiple R-squared value for the HN subset? What does this value say about the adequacy of the HN.lm model? Is it a good model or not? Explain.
(11) Look at the diagnostic plots for the HN subset (the second set of diagnostic plots). What do the Residuals vs Fitted and the Normal Q-Q plots say about this model? Does the model fit well or not? Show the plots and explain.
(12) Run the following code to plot the data points for the HN subset and add the regression line. What does this plot suggest about the relationship between propodus height and closing force for Hemigrapsus nudus crabs? Show the plot and explain.
(13) Run the following code to create confidence intervals for the intercept (β0) and slope (β1) for the HN subset, and to get the model summary. What do the confidence intervals indicate about the plausibility of zero as a value for the slope and the intercept? Can your answer be confirmed with the summary output? Show the R output and explain.
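A confidence interval from `confint()` and the t-test p-value in `summary()` are two views of the same calculation: a 95% interval excludes zero exactly when the two-sided p-value is below 0.05. The sketch below places the two side by side; `ci_and_pvalues` is a hypothetical helper, and the 95% level is an assumption, since the text does not state one.

```r
# Confidence intervals and t-test p-values for each coefficient, side by side.
ci_and_pvalues <- function(model, level = 0.95) {
  ci <- confint(model, level = level)
  p <- summary(model)$coefficients[, "Pr(>|t|)"]
  data.frame(
    lower = ci[, 1],
    upper = ci[, 2],
    excludes_zero = ci[, 1] > 0 | ci[, 2] < 0,   # zero is not a plausible value
    p_value = p,
    row.names = rownames(ci)
  )
}
```

`ci_and_pvalues(HN.lm)` would return one row for the intercept and one for the slope.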
(14) Run the following code to create a plot of the data points and dotted regression lines color-coded by species (HN = blue, LB = red, and CP = black). In which species does force increase the most rapidly as height increases? Show the plot and explain.
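The plotting code for this question is not included in this text. A rough sketch of a plot matching the description is given below. It assumes columns `Force`, `Height`, and `Species` (full species names); `plot_by_species` is a hypothetical helper that also returns the three fitted slopes, which is the quantity the question asks you to compare.

```r
# Points and dotted per-species regression lines, color-coded by species.
# Assumes `crabs` has columns Force, Height, and Species (full names).
plot_by_species <- function(crabs) {
  cols <- c("Hemigrapsus nudus" = "blue",
            "Lophopanopeus bellus" = "red",
            "Cancer productus" = "black")
  sp <- as.character(crabs$Species)
  plot(crabs$Height, crabs$Force, col = unname(cols[sp]), pch = 19,
       xlab = "Propodus height (mm)", ylab = "Closing force (N)")
  # Fit one line per species, draw it dotted, and keep its slope.
  slopes <- sapply(names(cols), function(s) {
    fit <- lm(Force ~ Height, data = crabs[sp == s, ])
    abline(fit, col = cols[[s]], lty = 3)   # lty = 3 is a dotted line
    coef(fit)[["Height"]]
  })
  legend("topleft", legend = c("HN", "LB", "CP"),
         col = unname(cols), lty = 3, pch = 19)
  slopes   # named by species; the largest slope is the steepest line
}
```

The steepest line in the plot corresponds to the largest value in the returned vector.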
(15) Explain one way that we could improve the analysis in this study. Note: there are many possible answers.