Mar 19, 2024
Coursework 2: Independent Sampler and RWM

To be submitted online via Blackboard Turnitin by 10am, Thursday 21 March. Submitted work should be a pdf file containing your full solutions, including R programs and code to run the R programs, additional text and figures, and it should be presented as a concise report. This assignment counts for 12.5% of the final mark for the course and should take, on average, 10 hours to complete.

The number of children of widowed women in Manchester, in 2023, is provided in the table below:

\begin{tabular}{l|c|c|c|c|c|c|c}
No. of children & 0 & 1 & 2 & 3 & 4 & 5 & 6 \\
\hline
Observed no. of widows & 52 & 10 & 25 & 8 & 7 & 3 & 0
\end{tabular}

For many data sets consisting of count data, there are more zeros in the data than can be accounted for by a Poisson model. Therefore a zero-inflated Poisson model is often used. Let $X$ be the number of children per widow; then the data $x_{1}, x_{2}, \ldots, x_{n}$ are assumed to be iid realisations from

$$X \stackrel{D}{=}\left\{\begin{array}{ll} \operatorname{Po}(\lambda) & \text{with probability } 1-\epsilon \\ 0 & \text{with probability } \epsilon \end{array}\right.$$

where $(\lambda, \epsilon)$ are parameters to be estimated. This is an example of a mixture distribution, where $X$ is distributed either according to distribution $A \sim \operatorname{Po}(\lambda)$ or distribution $B \equiv 0$. Therefore

$$P(X=0)=\epsilon+(1-\epsilon) \exp(-\lambda)$$

and, for $k=1,2,\ldots$,

$$P(X=k)=(1-\epsilon) \times \frac{\lambda^{k}}{k!} \exp(-\lambda).$$

For $k=0,1,\ldots$, let $n_{k}=\sum_{j=1}^{n} 1_{\{x_{j}=k\}}$ denote the total number of $x_{j}$'s equal to $k$. Then $\mathbf{n}=(n_{0}, n_{1}, \ldots)$ is a sufficient statistic, such that $\pi(\epsilon, \lambda \mid \mathbf{n})=\pi(\epsilon, \lambda \mid \mathbf{x})$. Thus, the likelihood (up to proportionality) can be written as

$$L(\epsilon, \lambda \mid \mathbf{n}) \propto \{\epsilon+(1-\epsilon) \exp(-\lambda)\}^{n_{0}} \prod_{k=1}^{6}\left\{(1-\epsilon) \frac{\lambda^{k}}{k!} \exp(-\lambda)\right\}^{n_{k}}$$

1. Write* and run an Independent sampler to obtain a sample of size 10000 from the posterior distribution of $(\epsilon, \lambda)$. Assuming a $\operatorname{Beta}(a, b)$ proposal for $\epsilon$ and a $\operatorname{Gamma}(\alpha, \beta)$ proposal for $\lambda$, discuss how you tuned the parameters of these proposals (you may assume $\pi(\epsilon, \lambda) \propto 1$).
2. Write* and run a RWM algorithm to obtain a sample of size 10000 from the posterior distribution of $(\epsilon, \lambda)$. Assuming Normal proposals for both $\epsilon$ and $\lambda$, explain your process of selecting the proposal variances $\sigma_{\epsilon}^{2}$ and $\sigma_{\lambda}^{2}$ (again, you may assume $\pi(\epsilon, \lambda) \propto 1$).
3. Report the posterior means and standard deviations of $\epsilon$ and $\lambda$ obtained from both algorithms. [1]
4. Using the concept of the predictive distribution and the samples obtained from the posterior distribution, based on both the Independent sampler and the RWM, write a program to estimate the mean number of widows in a sample of 105 who will have $0, 1, \ldots, 6$ children, respectively. Comment on how the numbers from the two MCMC algorithms compare between them and with the observations in the table above. Is the zero-inflated Poisson model a good fit?
*Note: The file CW2 R code.txt accompanying the coursework should be used as a template for your IS/RWM algorithms.
Solution by Steps
step 1
To sample from the posterior distribution of $(\epsilon, \lambda)$ using an Independent sampler, we first need to define the likelihood function and the prior distribution for $\epsilon$ and $\lambda$
step 2
The likelihood function is given by: $L(\epsilon, \lambda \mid \mathbf{n}) \propto \{\epsilon+(1-\epsilon) \exp(-\lambda)\}^{n_{0}} \prod_{k=1}^{6}\left\{(1-\epsilon) \frac{\lambda^{k}}{k!} \exp(-\lambda)\right\}^{n_{k}}$
step 3
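We assume the flat prior $\pi(\epsilon, \lambda) \propto 1$ on $0 < \epsilon < 1$, $\lambda > 0$, so the posterior density $\pi(\epsilon, \lambda \mid \mathbf{n})$ is proportional to the likelihood on that region and zero outside it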
step 4
We propose new values for $\epsilon$ from a $\operatorname{Beta}(a, b)$ distribution and for $\lambda$ from a $\operatorname{Gamma}(\alpha, \beta)$ distribution
step 5
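We accept the proposed values $(\epsilon', \lambda')$ with probability $\min\left\{1, \frac{\pi(\epsilon', \lambda' \mid \mathbf{n})\, q(\epsilon, \lambda)}{\pi(\epsilon, \lambda \mid \mathbf{n})\, q(\epsilon', \lambda')}\right\}$, where $(\epsilon, \lambda)$ are the current values and $q$ is the joint proposal density (the product of the Beta and Gamma densities); the proposal is not symmetric, so the $q$ terms do not cancel and the ratio of posterior densities alone is not enough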
step 6
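We tune the parameters $(a, b)$ of the Beta distribution and $(\alpha, \beta)$ of the Gamma distribution so that the proposal resembles the posterior with somewhat heavier tails, for example by matching the proposal means and variances to those estimated from a pilot run; for an Independent sampler a higher acceptance rate indicates a better-matched proposal, so the 20% to 50% guideline used for random-walk proposals is not the target here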
step 7
We iterate this process to obtain a sample of size 10000 from the posterior distribution
Answer
The solution involves setting up the likelihood and prior, proposing new values for the parameters, calculating acceptance probabilities that account for the proposal densities, and tuning the proposal distributions so that they approximate the posterior.
Key Concept
Independent sampler for Bayesian inference
Explanation
The Independent sampler is a Markov Chain Monte Carlo (MCMC) method used to sample from a posterior distribution when direct sampling is difficult. It proposes new parameter values from a fixed distribution that does not depend on the current state and accepts them with a Metropolis-Hastings probability involving both the posterior density (likelihood times prior) and the proposal density.
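The steps above can be sketched in R as follows. This is a minimal sketch under stated assumptions, not the coursework template: the names `log_post` and `independence_sampler`, the rate parameterisation of the Gamma proposal, the starting point, and the proposal parameters in the usage line are all illustrative choices (the latter come from a rough moment match and would still need checking against a pilot run).

```r
# Observed counts n_0, ..., n_6 from the table (105 widows in total)
n_obs <- c(52, 10, 25, 8, 7, 3, 0)

# Log-posterior under the flat prior, up to an additive constant
log_post <- function(eps, lam) {
  if (eps <= 0 || eps >= 1 || lam <= 0) return(-Inf)  # outside the support
  log_p0 <- log(eps + (1 - eps) * exp(-lam))          # log P(X = 0)
  log_pk <- log(1 - eps) + dpois(1:6, lam, log = TRUE) # log P(X = k), k = 1..6
  n_obs[1] * log_p0 + sum(n_obs[-1] * log_pk)
}

independence_sampler <- function(n_iter, a, b, alpha, beta,
                                 init = c(0.3, 1.5)) {  # init is an assumption
  # Log joint proposal density: Beta(a, b) x Gamma(shape = alpha, rate = beta)
  log_q <- function(eps, lam) {
    dbeta(eps, a, b, log = TRUE) +
      dgamma(lam, shape = alpha, rate = beta, log = TRUE)
  }
  draws <- matrix(NA_real_, n_iter, 2,
                  dimnames = list(NULL, c("eps", "lam")))
  cur <- init
  # Log importance weight, log pi - log q, at the current state
  cur_lw <- log_post(cur[1], cur[2]) - log_q(cur[1], cur[2])
  n_acc <- 0
  for (i in seq_len(n_iter)) {
    # Proposal is drawn independently of the current state
    prop <- c(rbeta(1, a, b), rgamma(1, shape = alpha, rate = beta))
    prop_lw <- log_post(prop[1], prop[2]) - log_q(prop[1], prop[2])
    # Accept with probability min(1, w(proposed) / w(current))
    if (is.finite(prop_lw) && log(runif(1)) < prop_lw - cur_lw) {
      cur <- prop
      cur_lw <- prop_lw
      n_acc <- n_acc + 1
    }
    draws[i, ] <- cur
  }
  list(draws = draws, acc_rate = n_acc / n_iter)
}

# Usage with illustrative (not tuned) proposal parameters
set.seed(1)
fit_is <- independence_sampler(10000, a = 16, b = 22, alpha = 49, beta = 23)
fit_is$acc_rate
```

Working with the log importance weight keeps the proposal densities in the acceptance ratio and avoids numerical underflow in the likelihood.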
---
Solution by Steps
step 1
To implement the Random Walk Metropolis (RWM) algorithm, we start with initial values for $(\epsilon, \lambda)$
step 2
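We propose new values for $(\epsilon, \lambda)$ by adding independent Normal increments with mean 0 and variances $\sigma_{\epsilon}^{2}$ and $\sigma_{\lambda}^{2}$ to the current values of $\epsilon$ and $\lambda$ respectively; a proposal with $\epsilon \notin (0,1)$ or $\lambda \leq 0$ has zero posterior density and is always rejected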
step 3
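We calculate the acceptance probability as $\min\left\{1, \frac{\pi(\epsilon', \lambda' \mid \mathbf{n})}{\pi(\epsilon, \lambda \mid \mathbf{n})}\right\}$, the ratio of the posterior densities of the proposed values $(\epsilon', \lambda')$ to the current values capped at 1; the proposal densities cancel because the Normal random-walk proposal is symmetric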
step 4
We select the proposal variances $\sigma_{\epsilon}^{2}$ and $\sigma_{\lambda}^{2}$ by trial and error to achieve an acceptance rate between 20% and 50%
step 5
We iterate this process to obtain a sample of size 10000 from the posterior distribution
Answer
The solution involves initializing the parameters, proposing new values using a normal distribution, calculating acceptance probabilities, and tuning the proposal variances to achieve a reasonable acceptance rate.
Key Concept
Random Walk Metropolis algorithm for MCMC
Explanation
The RWM algorithm is an MCMC method that uses a random walk to explore the parameter space. The proposal variances are critical for the efficiency of the algorithm and need to be tuned for a good acceptance rate.
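The RWM steps above can be sketched in R as follows. This is a minimal sketch, not the coursework template: the names `log_post` and `rwm`, the joint update of both parameters in one proposal, the starting point, and the proposal standard deviations in the usage line are assumptions made for illustration, and the standard deviations are starting values to adjust rather than tuned results.

```r
# Observed counts n_0, ..., n_6 from the table (105 widows in total)
n_obs <- c(52, 10, 25, 8, 7, 3, 0)

# Log-posterior under the flat prior, up to an additive constant
log_post <- function(eps, lam) {
  if (eps <= 0 || eps >= 1 || lam <= 0) return(-Inf)  # outside the support
  log_p0 <- log(eps + (1 - eps) * exp(-lam))          # log P(X = 0)
  log_pk <- log(1 - eps) + dpois(1:6, lam, log = TRUE) # log P(X = k), k = 1..6
  n_obs[1] * log_p0 + sum(n_obs[-1] * log_pk)
}

rwm <- function(n_iter, sd_eps, sd_lam, init = c(0.3, 1.5)) {  # init is an assumption
  draws <- matrix(NA_real_, n_iter, 2,
                  dimnames = list(NULL, c("eps", "lam")))
  cur <- init
  cur_lp <- log_post(cur[1], cur[2])
  n_acc <- 0
  for (i in seq_len(n_iter)) {
    # Symmetric Normal random-walk step around the current state
    prop <- cur + rnorm(2, mean = 0, sd = c(sd_eps, sd_lam))
    prop_lp <- log_post(prop[1], prop[2])  # -Inf if outside the support
    # Accept with probability min(1, posterior ratio)
    if (log(runif(1)) < prop_lp - cur_lp) {
      cur <- prop
      cur_lp <- prop_lp
      n_acc <- n_acc + 1
    }
    draws[i, ] <- cur
  }
  list(draws = draws, acc_rate = n_acc / n_iter)
}

# Usage: adjust the standard deviations until acc_rate falls in the target range
set.seed(1)
fit_rwm <- rwm(10000, sd_eps = 0.08, sd_lam = 0.35)
fit_rwm$acc_rate
```

If the acceptance rate comes out too high, the steps are too small and the standard deviations can be increased; if it is too low, they can be decreased.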
---
Solution by Steps
step 1
To report the posterior means and standard deviations of $\epsilon$ and $\lambda$, we calculate the sample mean and standard deviation from the samples obtained from both the Independent sampler and the RWM algorithm
step 2
The sample mean is calculated as the average of the samples, and the sample standard deviation is calculated as the square root of the variance of the samples
Answer
The posterior means and standard deviations are calculated using the sample mean and standard deviation formulas on the samples obtained from the MCMC algorithms.
Key Concept
Posterior summary statistics
Explanation
The posterior mean and standard deviation provide a summary of the central tendency and dispersion of the parameter estimates from the posterior distribution.
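As a small illustration of these summaries, the sketch below computes the column means and standard deviations of a matrix of draws. The function name `summarise_draws`, the optional `burn_in` argument, and the four-row `toy_draws` matrix are hypothetical; the toy values are placeholders standing in for real sampler output, not results.

```r
# Posterior mean and standard deviation of each parameter.
# `draws` is a matrix with one row per iteration and columns "eps" and "lam";
# `burn_in` (an assumption, default 0) discards that many initial rows.
summarise_draws <- function(draws, burn_in = 0) {
  kept <- draws[seq_len(nrow(draws)) > burn_in, , drop = FALSE]
  rbind(mean = colMeans(kept), sd = apply(kept, 2, sd))
}

# Placeholder draws for illustration only (not output from either sampler)
toy_draws <- cbind(eps = c(0.40, 0.45, 0.42, 0.38),
                   lam = c(2.0, 2.2, 2.1, 1.9))
summarise_draws(toy_draws)
```

Applying the same function to the draws from each algorithm gives two small tables that can be reported side by side.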
---
Solution by Steps
step 1
To estimate the mean number of widows with $0, 1, \ldots, 6$ children, we use the predictive distribution, which is based on the posterior samples of $(\epsilon, \lambda)$
step 2
For each posterior sample, we calculate the predictive probabilities for $0, 1, \ldots, 6$ children using the zero-inflated Poisson model
step 3
We then average these probabilities across all posterior samples and multiply each average by 105, the sample size, to estimate the mean number of widows for each number of children
step 4
We compare these estimates with the observed data and assess the fit of the zero-inflated Poisson model
Answer
The mean number of widows for each number of children is estimated by averaging the predictive probabilities across the posterior samples and multiplying by 105, and the model fit is assessed by comparing these estimates with the observed data.
Key Concept
Predictive distribution in Bayesian inference
Explanation
The predictive distribution uses the posterior distribution of the parameters to estimate the probabilities of future observations. It is a key concept in Bayesian predictive modeling.
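The predictive calculation can be sketched in R as follows. The names `zip_pmf` and `expected_counts` and the four-row `toy_draws` matrix are hypothetical; the toy values are placeholders standing in for real posterior draws, so the printed numbers are not results.

```r
# Observed counts n_0, ..., n_6 from the table (105 widows in total)
n_obs <- c(52, 10, 25, 8, 7, 3, 0)

# Zero-inflated Poisson pmf: P(X = k) for given eps and lam
zip_pmf <- function(k, eps, lam) {
  (1 - eps) * dpois(k, lam) + eps * (k == 0)
}

# Expected number of widows with k = 0, ..., 6 children in a sample of n_new.
# `draws` is a matrix of posterior draws with columns "eps" and "lam".
expected_counts <- function(draws, n_new = 105, k = 0:6) {
  eps <- draws[, "eps"]
  lam <- draws[, "lam"]
  # Predictive probability of each k: average of the pmf over the draws
  pred_prob <- sapply(k, function(kk) mean(zip_pmf(kk, eps, lam)))
  setNames(n_new * pred_prob, k)
}

# Placeholder draws for illustration only (not output from either sampler)
toy_draws <- cbind(eps = c(0.40, 0.45, 0.42, 0.38),
                   lam = c(2.0, 2.2, 2.1, 1.9))

# Observed counts next to the expected counts
rbind(observed = n_obs,
      expected = round(expected_counts(toy_draws), 1))
```

The expected counts for 0 to 6 children sum to slightly less than 105, because the model also assigns a small probability to 7 or more children.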