TEXTBOOK - COMPUTATIONAL STATISTICS SECOND EDITION By GEOF H. GIVENS AND JENNIFER A. HOETING
Problem 1-
Consider the data set "Rat_data.csv" (Read the data using the function read.csv()) in R. Each data point represents the weight of a rat at a certain age. More specifically, weights of 30 rats (in grams) were measured at five time points of x1 = 8, x2 = 15, x3 = 22, x4 = 29, x5 = 36(weeks). For example, weight of rat 3 in week 15 was 214 grams. Let yij denote the weight of rat i at age xj, and consider the following model for the yij
Yij ~ N(αi + βi(xj - x-), σ2) for i = 1, . . . , 30 and j = 1, . . . , 5 and x- = 22
(a) Obtain the maximum likelihood estimate of αi, βi and σ2. Show your work, and compute the final estimates.
(b) Now consider a Bayesian analysis of this model, where we assume the following priors
αi ~ N(α, 100) with hyperprior α ~ N(0,10000)
βi ~ N(β, 1) with hyperprior β~ N(0,10000)
?? = (1/σ2) ~ gamma(10-3, 10-3)
where ~αi, βi and ?? are assumed to be independent. Our aim here is to use the Gibbs sampling method to generate values from the posterior distribution of (α1, . . . , α30, β1, . . . , β30, τ).
(i) State all the required conditionals. Show your derivation very briefly.
(ii) Write a complete algorithm for this Gibbs sampling algorithm, specific to this problem and code your algorithm, using R.
(iii) Run the chain for 50,000 iterations, and burn-in the first 10,000 generated values. Plot the chain (trace) after the burn-in for α1, β1, α15, β15, τ against the iteration number, and comment on the mixing.
(iv) Draw the histograms for the posterior distribution of α1, β1, α15, β15, τ and provide a point estimate for each parameter.
(c) Obtain a 95% credible interval for each of the parameters α1, β1. Interpret your intervals in the context of the problem.
(d) In the prior, we assumed that α1, β1 and τ are independent. Does the posterior support this assumption? Justify your answer.
Problem 2-
Consider simulation of the bivariate normal distribution with μ = (1, 2)T and covariance matrix
using Metropolis-Hastings algorithm with random walk generating density. Specifically, let x∗ = x(t)+ε, where ε ~ N2(0, D) with D being a diagonal matrix with diagonal elements δ1 and δ2. Note that δ1 and δ2 control the spread of values generated along the first and second coordinate axis, respectively.
(a) Give the M-H algorithm, and code it in R.
(b) Choose δ1 = 0.001 and δ2 = 0.001, and generate a chain, starting at x(0) = (0, 0)t, of size 10,000 using the M-H algorithm.
(i) Compute the acceptance rate, and comment on the acceptance rate.
(ii) Burn 1000, and draw a trace plot and the density plot for the chain for each of the variables. Comment on mixing and the density plots.
(iii) Draw 9000 pairs generated from multivariate normal with μ = (1, 2)T and covariance matrix
Plot these pairs on a scatterplot overlaying it with the last 9000 points obtained from the M-H chain; use different color for those generated from MH and those directly generated from the target (choosing an alpha-level for color transparency is helpful). Do the M-H generated values match the target? Explain why or why not.
(c) Choose appropriate δ1 and δ2 so that the M-H algorithm would generate values that meet the target requirement. Draw a similar scatter-plot as in part a(ii) for this case.
(d) How do you expect the acceptance rate would change, if you choose δ1 and δ2 to be very large? Briefly explain why?
I will need the assignment along with the R codes. R codes are extremely important in this course, please have some explanation in R file. Please try to avoid using built in packages if possible.
Attachment:- rats_data.rar