STATS 330 Advanced Statistical Modelling

Questions:
Notes:
(i) Write your assignment using R Markdown. Knit your report to either a Word or PDF document.
 
(ii) Create a section for each question. Include all relevant code and output in the final document.
 
(iii) Marks may be deducted for poor style. Please keep your code and plots neat.
 
(iv) Please remember to hand in your hard copy, with signed cover sheet, by the due date.
 
1. This question uses parametric bootstrapping (simulation based on a fitted model) to explore the distribution of deviances for a Poisson regression and for a negative binomial regression. Recall that the sampling distribution of the residual deviance is approximately Chi-squared, with degrees of freedom equal to the residual degrees of freedom. In some cases it may not be clear whether this approximation is reasonable. A parametric bootstrap can be used to evaluate how accurate the approximation is and (if required) to provide a more appropriate sampling distribution.
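For reference, the approximation recalled above can be written out as follows. This is the standard form of the Poisson residual deviance, not a formula taken from the assignment; here n is the number of observations, p the number of estimated coefficients, and the fitted means are written with hats. The term y_i log(y_i / mu_i) is taken to be 0 when y_i = 0.

```latex
D = 2 \sum_{i=1}^{n} \left[ y_i \log\!\left( \frac{y_i}{\hat{\mu}_i} \right) - \left( y_i - \hat{\mu}_i \right) \right],
\qquad
D \overset{\text{approx.}}{\sim} \chi^2_{\,n-p}
```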
 
(a) You will recall that in Question 1 of Assignment 2 you fitted a Poisson model mod.poss and a negative binomial model mod.nb to the Billboard data. Reproduce the ungrouped data vector y and recreate the model objects mod.poss and mod.nb. Obtain the residual deviance and degrees of freedom for each model.
 
(b) Suppose that the data really do come from a Poisson distribution with mean μ = exp(β0) = ȳ, and simulate 10,000 residual deviances. These deviances provide us with an empirical sampling distribution for the residual deviance. Now produce a histogram of these deviances and superimpose the density function for the Chi-squared approximation. You may use the following code.
 
Comment briefly on what you see and suggest a reason why these two distributions do not exactly ‘line up’. Hint: when does the Chi-squared approximation work for modelling Poisson count data? In this situation, what would be the consequences of using the Chi-squared approximation to calculate a p-value for a goodness-of-fit test?
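The simulation described in (b) can be sketched as below. This is not the code supplied with the assignment (which expects R); it is a minimal, standard-library Python illustration of the same idea for an intercept-only Poisson model. The count vector y_obs is hypothetical and is NOT the Billboard data, and the helper names (rpois, poisson_deviance, chisq_pdf, parametric_bootstrap) are made up for this sketch. Instead of drawing a plot, it prints the binned simulated density next to the Chi-squared density at each bin midpoint.

```python
import math
import random


def rpois(mu, rng):
    """Draw one Poisson(mu) value using Knuth's method (adequate for small mu)."""
    limit = math.exp(-mu)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def poisson_deviance(y):
    """Residual deviance of an intercept-only Poisson fit, whose fitted mean is mean(y)."""
    ybar = sum(y) / len(y)
    if ybar == 0:
        return 0.0  # all counts are zero, so the fit is exact
    # The (y - ybar) terms sum to zero, so only the y * log(y / ybar) terms remain.
    return 2.0 * sum(v * math.log(v / ybar) for v in y if v > 0)


def chisq_pdf(x, df):
    """Chi-squared density with df degrees of freedom."""
    if x <= 0:
        return 0.0
    half = df / 2.0
    log_pdf = (half - 1.0) * math.log(x) - x / 2.0 - half * math.log(2.0) - math.lgamma(half)
    return math.exp(log_pdf)


def parametric_bootstrap(y, n_sim, seed=330):
    """Simulate n_sim datasets from Poisson(mean(y)) and return their residual deviances."""
    rng = random.Random(seed)
    mu = sum(y) / len(y)
    return [poisson_deviance([rpois(mu, rng) for _ in y]) for _ in range(n_sim)]


# Hypothetical counts for illustration only (not the Billboard data).
y_obs = [0, 1, 0, 2, 1, 0, 0, 3, 1, 0, 1, 0, 2, 0, 1]
sim_dev = parametric_bootstrap(y_obs, n_sim=10_000)
df = len(y_obs) - 1  # residual degrees of freedom for an intercept-only model

# Text stand-in for the histogram with the Chi-squared density superimposed.
width = 4.0
for lo in range(0, 40, 4):
    count = sum(1 for d in sim_dev if lo <= d < lo + width)
    empirical = count / (len(sim_dev) * width)
    midpoint_density = chisq_pdf(lo + width / 2.0, df)
    print(f"[{lo:2d}, {lo + 4:2d})  simulated {empirical:.4f}  chi-squared {midpoint_density:.4f}")
```

Each simulated dataset is refitted before its deviance is computed (the fitted mean is the mean of the simulated counts, not of y_obs), which is what makes the collection of deviances a sampling distribution for the fitted model.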
 
(c) Now carry out a new goodness-of-fit test for the Poisson model. First, calculate the p-value using the empirical sampling distribution for the residual deviance based on your simulations. How does it compare to the p-value calculated using the Chi-squared approximation?
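An empirical p-value of the kind asked for in (c) is usually taken to be the proportion of simulated deviances that are at least as large as the observed residual deviance. A minimal Python sketch of that calculation follows; the function name empirical_p_value and all of the numbers are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def empirical_p_value(observed, simulated):
    """Proportion of simulated deviances at least as large as the observed deviance."""
    exceed = sum(1 for d in simulated if d >= observed)
    return exceed / len(simulated)


# Hypothetical values for illustration; a real analysis would use the
# 10,000 simulated deviances and the residual deviance of the fitted model.
sim_dev = [12.1, 15.4, 9.8, 20.3, 17.6, 11.2, 14.9, 22.5, 13.3, 16.0]
obs_dev = 19.0

print(empirical_p_value(obs_dev, sim_dev))  # 2 of the 10 values are >= 19.0
```

Some authors prefer (exceed + 1) / (n + 1) so that the estimate can never be exactly zero; with 10,000 simulations the two versions differ very little.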

2. In Question 1 you created an empirical sampling distribution for the residual deviance of the Poisson regression model using the parametric bootstrap. Now use the non-parametric bootstrap to create a sampling distribution for the residual deviance and produce a histogram. Compare this histogram to the one produced by parametric bootstrapping. Explain why these distributions are so different.
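The non-parametric version replaces simulation from the fitted model with resampling of the observed counts, with replacement. A minimal Python sketch under the same intercept-only assumption is given below; y_obs is again hypothetical (not the Billboard data) and the function names are made up for this illustration.

```python
import math
import random


def poisson_deviance(y):
    """Residual deviance of an intercept-only Poisson fit, whose fitted mean is mean(y)."""
    ybar = sum(y) / len(y)
    if ybar == 0:
        return 0.0  # all counts are zero, so the fit is exact
    return 2.0 * sum(v * math.log(v / ybar) for v in y if v > 0)


def nonparametric_bootstrap(y, n_boot, seed=330):
    """Resample y with replacement n_boot times and return the residual deviances."""
    rng = random.Random(seed)
    return [poisson_deviance(rng.choices(y, k=len(y))) for _ in range(n_boot)]


# Hypothetical counts for illustration only (not the Billboard data).
y_obs = [0, 0, 1, 0, 7, 0, 2, 0, 12, 1, 0, 0, 5, 0, 3]
boot_dev = nonparametric_bootstrap(y_obs, n_boot=2000)

mean_dev = sum(boot_dev) / len(boot_dev)
print(f"mean bootstrap deviance: {mean_dev:.2f}; residual df: {len(y_obs) - 1}")
```

Resampled datasets carry over whatever spread the observed counts have, whereas datasets simulated from the fitted Poisson model have variance equal to the mean by construction. That distinction is worth keeping in mind when comparing the two histograms.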
 
A quasar is a distant celestial object that is at least four billion light-years away from Earth. The Astronomical Journal (Schmidt, M., Schneider, D. P., and Gunn, J. E., 1995) reported a study of quasars detected by a deep space survey. Based on the radiation emitted by each quasar, astronomers were able to measure several of its quantitative characteristics, including:
 
• RFwidth: rest frame equivalent width (our response variable)

• Redshift: the red-shift range

• Line: line flux in erg/cm

• Luminosity: line luminosity in erg/s

• AB1450: absolute magnitude
 
The objective of the study is to model the rest frame equivalent width (RFwidth) using the other variables measured.
 
3. In this question we will get you to find the ‘best’ model to fit astronomical data associated with measuring quasars.
 
(a) Use the R function pairs20x to investigate the variables and the relationships between them. In particular, discuss why logging RFwidth is appropriate and why it may not be necessary to use all of the other variables to model RFwidth.
 
(b) Explain why AICc is a more appropriate measure of model adequacy than AIC for these data. Use the package MuMIn and its function dredge to produce a list of the 5 best-fitting models based on AICc. Is the best model in this list clearly better than the others? Explain your answer.
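For reference, the usual small-sample correction is shown below. This is the standard definition, not something stated in the assignment; k denotes the number of estimated parameters, n the number of observations, and L-hat the maximised likelihood.

```latex
\mathrm{AIC} = -2 \log \hat{L} + 2k,
\qquad
\mathrm{AICc} = \mathrm{AIC} + \frac{2k(k+1)}{n - k - 1}
```

The correction term shrinks towards zero as n grows relative to k, so the two criteria agree for large samples and differ most when n is small.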
 
(c) Apply the usual diagnostic procedures to the best fitting model. Summarise your findings.
 
(d) Write a brief executive summary describing how RFwidth is related to the other variables, based on the best-fitting model.
