Statistics Homework Assignment - Questions 1 to 3

Question 1

We expect the number of doctors in a city to be related to the number of hospitals, reflecting both the size of the city and the general level of medical care.

The JMP data file Hospital contains information on 83 metropolitan areas in the U.S. that have at least two community hospitals. We are interested in the relationship between the number of doctors “NumMDs” (Y variable) and the number of hospitals “NumHospitals” (X variable).

1a) Use Graph → Graph Builder on JMP to create a scatterplot of the number of doctors “NumMDs” (Y variable) against the number of hospitals “NumHospitals” (X variable). Highlight and drag NumMDs to the Y-axis, and NumHospitals to the X-axis. Use the icons at the top to remove any curve overlays and leave you with a basic scatter plot. Copy and paste this graph into your document, suitably titled.

1b) Use your graph to describe the relationship between the number of doctors and number of hospitals.

1c) Use Analyze → Fit Y by X and the “Fit Line” procedure in JMP to produce the line of best fit plot and the Summary of Fit information. Copy and paste both of these portions of the output into your document.

1d) Using the information from part 1c, what does the R² value tell you about this fit?
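As a cross-check on what JMP reports in Summary of Fit, the sketch below computes a least-squares line and its R² by hand. It is only an illustration: the function names `fit_line` and `r_squared` are made up for this example, and the five data pairs are invented, not taken from the Hospital data file.

```python
# Illustrative only: these (hospitals, doctors) pairs are invented,
# not values from the Hospital data file.
hospitals = [2, 4, 6, 8, 10]
doctors = [310, 680, 1490, 2150, 3380]


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Return R^2 = 1 - SS_residual / SS_total for the given line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


intercept, slope = fit_line(hospitals, doctors)
print(f"R^2 = {r_squared(hospitals, doctors, intercept, slope):.3f}")
```

R² is the proportion of the variation in the response that the fitted line accounts for, so a value near 1 means the points lie close to the line. A high R² alone does not show that a straight line is the right form, which is why the residual plots are still needed.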

1e) Use JMP’s lower red triangle and the Plot Residuals command to create the following two residual plots: Residual by Predicted plot and the Normal Quantile plot. Copy these two graphs into your document.

1f) Now use the information from these residual plots to comment on the adequacy of the model fit.

Transforming the data

We would now like to use a transformation on the original data to try to improve the model fit.  Use JMP to create a column containing the square root of the “NumMDs” variable.

Recall: To create a new column containing the transformed data, follow the steps below.

i) Click on the Fit Y by X icon at the top of the screen, or select Analyze → Fit Y by X.

ii) Right-click on the variable name and apply the appropriate transformation.

iii) Right-click on the transformed variable name and add this column to the data set.

For questions (1g)-(1l), repeat questions (1a)-(1f) from above using the transformed variable, Square Root (NumMDs), as the new response variable and retaining the number of hospitals “NumHospitals” as the explanatory variable.

Prediction

1m) Use the equation from your transformed linear model to predict the number of doctors in Louisville, KY, which has 18 community hospitals. Round to the nearest whole number.

1n) Compare your predicted value from part 1m to the actual value for Louisville. What is the residual for Louisville?
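Because the transformed model predicts the square root of NumMDs, the fitted value has to be squared to get back to a number of doctors before a residual on the original scale can be computed. The sketch below shows that arithmetic. The coefficients `b0_example` and `b1_example` and the "actual" value are placeholders invented for illustration; use the estimates from your own JMP output and the Louisville row of the data file.

```python
def predict_doctors(b0, b1, num_hospitals):
    """Predict NumMDs from a model fitted to sqrt(NumMDs).

    The model gives sqrt(NumMDs) = b0 + b1 * NumHospitals, so the
    prediction on the original scale is that value squared.
    """
    sqrt_prediction = b0 + b1 * num_hospitals
    return round(sqrt_prediction ** 2)


def residual(actual, predicted):
    """Residual = observed value minus predicted value."""
    return actual - predicted


# Hypothetical coefficients, NOT the JMP estimates for the Hospital data.
b0_example, b1_example = 10.0, 2.0

predicted = predict_doctors(b0_example, b1_example, 18)
print(predicted)                  # (10 + 2 * 18)^2 = 46^2 = 2116
print(residual(2000, predicted))  # 2000 is a made-up actual value -> -116
```

A negative residual means the model over-predicts for that city, and a positive one means it under-predicts.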

Question 2

Researchers have studied factors that might be associated with the jumping performance of domestic cats. They studied 18 cats selected at random, using takeoff velocity (in centimeters per second) as the response variable. They used body mass (in grams), hind limb length (in centimeters), muscle mass (in grams), percent body fat and sex as potential explanatory variables. The data can be found in the JMP data file Cat Jumping.

2a) We will first use one-way ANOVA to test whether sex significantly affects the mean takeoff velocity of the cats. Select Analyze → Fit Y by X, place velocity in the Y box, the categorical sex variable in the X Factor box and hit OK. Copy and paste the resulting graphical display into your document.

2b) State the null and alternative hypotheses for this research in the context of the question.

2c) Using the red triangle, select Means/Anova to conduct an analysis of variance on the data. Copy and paste the Analysis of Variance table into your document.

2d) Use your output from part 2c to make a decision concerning your hypotheses from part 2b. Justify your decision at the α = 0.05 level of significance.
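To see where the F Ratio in the Analysis of Variance table comes from, the sketch below computes it directly from the between-group and within-group sums of squares. The function name `one_way_anova_f` and the two groups of velocities are invented for illustration and are not the Cat Jumping data. The p-value is not computed here; read it from the Prob > F column in JMP.

```python
def one_way_anova_f(*groups):
    """Return (F ratio, df between, df within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)

    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Within-group sum of squares: observations around their own group mean.
    ss_within = sum(
        (v - sum(g) / len(g)) ** 2 for g in groups for v in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Made-up takeoff velocities (cm/s) for two groups, for illustration only.
group_a = [330.0, 345.0, 352.0, 338.0]
group_b = [341.0, 360.0, 349.0, 355.0]

f_ratio, df_b, df_w = one_way_anova_f(group_a, group_b)
print(f"F({df_b}, {df_w}) = {f_ratio:.3f}")
```

With only two groups, as with sex here, the one-way ANOVA F statistic equals the square of the pooled two-sample t statistic, so both tests lead to the same decision.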

2e) Now we will investigate correlations between the response variable velocity and the four quantitative explanatory variables of bodymass, hindlimb, musclemass and percentbodyfat by following the steps below:

Analyze → Multivariate Methods → Multivariate

Enter all five variables into the Y box

Click OK

Copy and paste the Correlations table into your document.

2f) You will notice that one of the explanatory variables is quite strongly correlated with the response variable and one has a particularly low correlation with the response variable. Name these two variables and provide values.

2g) Looking at the correlations between the four explanatory variables, do you see any possible instances of multicollinearity occurring? Explain and provide values.
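One way to scan a correlation table for possible multicollinearity is to flag every pair of explanatory variables whose correlation is large in absolute value. The sketch below does this for a small invented data set. The helper names and the cutoff of 0.8 are assumptions for illustration (a rule of thumb, not a value given in this assignment), and the numbers are not from the Cat Jumping file.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def flag_collinear(columns, threshold=0.8):
    """Return (name_a, name_b, r) for each pair with |r| >= threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Invented values for illustration only.
explanatory = {
    "bodymass": [3200, 3900, 4500, 5100, 5600],
    "musclemass": [70, 84, 101, 110, 126],
    "percentbodyfat": [22, 18, 27, 20, 24],
}

for name_a, name_b, r in flag_collinear(explanatory):
    print(f"{name_a} vs {name_b}: r = {r:.3f}")
```

Pairs flagged this way carry overlapping information, which can inflate the standard errors of their coefficients in a multiple regression.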

2h) We will now fit a multiple regression model by using all four quantitative explanatory variables by following the steps below.

Analyze → Fit Model.

Enter velocity as the Y variable

Add the four quantitative explanatory variables in the “Construct Model Effects” box

Keep the default Personality as Standard Least Squares, and change the default Emphasis to Minimal Report.

Click Run

Copy and paste the following three tables into your document: “Summary of Fit”, “Analysis of Variance” and “Parameter Estimates”. Use these tables to answer the following questions.

2i) Write the equation for predicting velocity from the explanatory variables. (Use three decimal places for all coefficient estimates.)

2j) What does the coefficient of the explanatory variable hindlimb mean in this model?

2k) State the value of the F-statistic and its p-value using the “Analysis of Variance” table and use these to comment on the adequacy of the overall model.

2l) Use the “Parameter Estimates” table to determine which of the explanatory variables are significant in the model. Explain by referring to the p-values for the individual variables.

2m) Evaluate, numerically, how well the model fits the data using the “Summary of Fit” table.

Question 3

3a) We will now use Stepwise Regression to let JMP select the “best” model.

Enter the variables as in 2h but now select Stepwise in the Personality box and click Run.  Now select the Backward direction, click on Enter All and click Go. Finally, click on the Run Model box at the top to select the “best” model.

Copy and paste the following three tables into your document: “Summary of Fit”, “Analysis of Variance” and “Parameter Estimates”. Use these tables to answer the following four questions.

3b) Write the equation for predicting velocity from the explanatory variables selected by the Stepwise routine. (Use three decimal places for all coefficient estimates.)

3c) Use the adjusted R-squared value from this model to compare its fit to the fit obtained from the model in part 2m. Comment on any differences you observe.
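Adjusted R-squared is the right basis for this comparison because it penalizes extra explanatory variables, whereas plain R-squared can never decrease when a variable is added. The sketch below shows the standard formula with n = 18 cats. The R-squared values passed in are invented for illustration; substitute the values from your own Summary of Fit tables.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1).

    n is the number of observations and p the number of explanatory
    variables in the model (not counting the intercept).
    """
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


N_CATS = 18  # sample size stated in Question 2

# Hypothetical R^2 values, NOT results from the Cat Jumping data.
full_model = adjusted_r2(0.50, N_CATS, p=4)
reduced_model = adjusted_r2(0.48, N_CATS, p=2)

print(f"full model:    {full_model:.3f}")
print(f"reduced model: {reduced_model:.3f}")
```

With these made-up inputs the reduced model has the slightly lower R-squared but the higher adjusted R-squared, which is the pattern to look for when a dropped variable contributed little.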

3d) Refer back to your answer to part 2g, and comment on why JMP might have excluded variable(s) when selecting its “best” model through the Stepwise routine.

3e) Finally, we will run assumption checks for this JMP Stepwise model. Click on the lower red triangle by Response Velocity and then click on Row Diagnostics to request the “Residual by Predicted Plot” and the “Residual Normal Quantile Plot”. Copy and paste these two plots into your document. Use these residual plots to determine whether this model is appropriate, i.e., does it appear to satisfy the assumption checks?