
Mar 19, 2024

Coursework 2: Independent Sampler and RWM

To be submitted online via Blackboard Turnitin by 10am, Thursday 21 March. Submitted work should be a pdf file containing your full solutions, including R programs and code to run the R programs, additional text and figures, and it should be presented as a concise report. This assignment counts for 12.5% of the final mark for the course and should take, on average, 10 hours to complete.

The number of children of widowed women in Manchester, in 2023, is provided in the table below:

\begin{tabular}{l|c|c|c|c|c|c|c}
No. of children & 0 & 1 & 2 & 3 & 4 & 5 & 6 \\ \hline
Observed no. of widows & 52 & 10 & 25 & 8 & 7 & 3 & 0
\end{tabular}

For many data sets consisting of count data, there are more zeros in the data than can be accounted for by a Poisson model. Therefore a zero-inflated Poisson model is often used. Let $X$ be the number of children per widow; then the data $x_{1}, x_{2}, \ldots, x_{n}$ is assumed to be iid realisations from

$$X \stackrel{D}{=}\left\{\begin{array}{ll} \operatorname{Po}(\lambda) & \text { with probability } 1-\epsilon \\ 0 & \text { with probability } \epsilon \end{array}\right.$$

where $(\lambda, \epsilon)$ are parameters to be estimated. This is an example of a mixture distribution, where $X$ is distributed either according to distribution $A \sim \operatorname{Po}(\lambda)$ or distribution $B \equiv 0$. Therefore

$$P(X=0)=\epsilon+(1-\epsilon) \exp (-\lambda)$$

and for $k=1,2, \ldots$

$$P(X=k)=(1-\epsilon) \times \frac{\lambda^{k}}{k !} \exp (-\lambda).$$

For $k=0,1, \ldots$, let $n_{k}=\sum_{j=1}^{n} 1_{\left\{x_{j}=k\right\}}$ denote the total number of $x_{j}$'s equal to $k$ and $\mathbf{n}=\left(n_{0}, n_{1}, \ldots\right)$ is a sufficient statistic, such that $\pi(\epsilon, \lambda \mid \mathbf{n})=\pi(\epsilon, \lambda \mid \mathbf{x})$. Thus, the likelihood (up to proportionality) can be written as

$$L(\epsilon, \lambda \mid \mathbf{n}) \propto\{\epsilon+(1-\epsilon) \exp (-\lambda)\}^{n_{0}} \prod_{k=1}^{6}\left\{(1-\epsilon) \frac{\lambda^{k}}{k !} \exp (-\lambda)\right\}^{n_{k}}$$

1. Write* and run an Independent sampler to obtain a sample of size 10000 from the posterior distribution of $(\epsilon, \lambda)$. Assuming a $\operatorname{Beta}(a, b)$ proposal for $\epsilon$ and a $\operatorname{Gamma}(\alpha, \beta)$ proposal for $\lambda$, discuss how you tuned the parameters of these proposals (you may assume $\pi(\epsilon, \lambda) \propto 1$).

2. Write* and run a RWM algorithm to obtain a sample of size 10000 from the posterior distribution of $(\epsilon, \lambda)$. Assuming Normal proposals for both $\epsilon$ and $\lambda$, explain your process of selecting the proposal variances $\sigma_{\epsilon}^{2}$ and $\sigma_{\lambda}^{2}$ (again, you may assume $\pi(\epsilon, \lambda) \propto 1$).

3. Report the posterior means and standard deviations of $\epsilon$ and $\lambda$ obtained from both algorithms. [1]

4. Using the concept of the predictive distribution and the samples obtained from the posterior distribution, based on both the Independent sampler and the RWM, write a program to estimate the mean number of widows in a sample of 105 who will have $0,1, \ldots, 6$ children, respectively. Comment on how the numbers from the two MCMC algorithms compare between them and with the observations in the table above. Is the zero-inflated Poisson model a good fit?

*Note: The file CW2 R code.txt accompanying the coursework should be used as a template for your IS/RWM algorithms.

Solution by Steps

step 1

To sample from the posterior distribution of $(\epsilon, \lambda)$ using an Independent sampler, we first need to define the likelihood function and the prior distributions for $\epsilon$ and $\lambda$

step 2

The likelihood function is given by:

$$L(\epsilon, \lambda \mid \mathbf{n}) \propto \{\epsilon+(1-\epsilon) \exp (-\lambda)\}^{n_{0}} \prod_{k=1}^{6}\left\{(1-\epsilon) \frac{\lambda^{k}}{k !} \exp (-\lambda)\right\}^{n_{k}}$$

step 3

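We assume a flat prior for $(\epsilon, \lambda)$, that is $\pi(\epsilon, \lambda) \propto 1$ for $0<\epsilon<1$ and $\lambda>0$, so the posterior density is proportional to the likelihood on that region and zero elsewhere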

step 4

We propose new values for $\epsilon$ from a $\operatorname{Beta}(a, b)$ distribution and for $\lambda$ from a $\operatorname{Gamma}(\alpha, \beta)$ distribution

step 5

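We accept the proposed values $(\epsilon', \lambda')$ with probability $\min\{1, w(\epsilon', \lambda') / w(\epsilon, \lambda)\}$, where $w(\cdot)=\pi(\cdot \mid \mathbf{n}) / q(\cdot)$ is the ratio of the posterior density to the proposal density $q$ (the product of the Beta and Gamma densities). Because the proposal does not depend on the current values, the proposal densities do not cancel as they do for a symmetric random walk. If the proposal is rejected, the chain stays at the current values.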

step 6

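We tune the parameters $(a, b)$ of the Beta distribution and $(\alpha, \beta)$ of the Gamma distribution so that the proposal resembles the posterior as closely as possible, with tails at least as heavy, for example by matching the proposal means and variances to estimates from a pilot run. Unlike a random walk sampler, an Independent sampler has no target band for the acceptance rate: the higher the rate, the better the proposal matches the posterior.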

step 7

We iterate this process to obtain a sample of size 10000 from the posterior distribution

Answer

The solution involves setting up the likelihood and prior, proposing new values from fixed Beta and Gamma distributions, calculating acceptance probabilities that include the proposal densities, and tuning the proposals to match the posterior so that the acceptance rate is as high as possible.

Key Concept

Independent sampler for Bayesian inference

Explanation

The Independent sampler is a Markov Chain Monte Carlo (MCMC) method used to sample from a posterior distribution when direct sampling is difficult. Each proposal is drawn from a fixed distribution that does not depend on the current state of the chain, and it is accepted or rejected by comparing the ratio of posterior density to proposal density at the proposed and current values.
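The steps above can be sketched in code. The coursework asks for R; the following is only a minimal Python sketch using the standard library, not the template from the coursework. It assumes that $\operatorname{Gamma}(\alpha, \beta)$ is parameterised by shape $\alpha$ and rate $\beta$, and the proposal values `a=20, b=28, alpha=49, beta=23` are illustrative guesses placed near the bulk of the posterior, not tuned results. The function names are hypothetical.

```python
import math
import random

# Observed counts n_0..n_6 from the table (105 widows in total).
COUNTS = [52, 10, 25, 8, 7, 3, 0]

def log_post(eps, lam):
    """Log posterior up to a constant under the flat prior; -inf outside the support."""
    if not (0.0 < eps < 1.0) or lam <= 0.0:
        return -math.inf
    lp = COUNTS[0] * math.log(eps + (1.0 - eps) * math.exp(-lam))
    for k in range(1, len(COUNTS)):
        # The k! term does not involve the parameters and is dropped.
        lp += COUNTS[k] * (math.log1p(-eps) + k * math.log(lam) - lam)
    return lp

def log_proposal(eps, lam, a, b, alpha, beta):
    """Log density of Beta(a, b) x Gamma(shape alpha, rate beta), up to a constant."""
    return ((a - 1.0) * math.log(eps) + (b - 1.0) * math.log1p(-eps)
            + (alpha - 1.0) * math.log(lam) - beta * lam)

def independence_sampler(n_iter, a, b, alpha, beta, seed=1):
    rng = random.Random(seed)

    def log_weight(eps, lam):
        # log of posterior density / proposal density
        lp = log_post(eps, lam)
        if lp == -math.inf:
            return lp
        return lp - log_proposal(eps, lam, a, b, alpha, beta)

    eps, lam = a / (a + b), alpha / beta  # start at the proposal means
    lw = log_weight(eps, lam)
    draws, accepted = [], 0
    for _ in range(n_iter):
        eps_new = rng.betavariate(a, b)
        lam_new = rng.gammavariate(alpha, 1.0 / beta)  # second argument is the scale
        lw_new = log_weight(eps_new, lam_new)
        # Accept with probability min(1, w_new / w_current).
        if math.log(1.0 - rng.random()) < lw_new - lw:
            eps, lam, lw = eps_new, lam_new, lw_new
            accepted += 1
        draws.append((eps, lam))
    return draws, accepted / n_iter

draws, rate = independence_sampler(10000, a=20, b=28, alpha=49, beta=23)
print(f"acceptance rate: {rate:.2f}")
```

Working with log weights avoids underflow in the likelihood. Re-running with different proposal parameters and comparing the printed acceptance rate is one way to carry out the tuning described above.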

---

Solution by Steps

step 1

To implement the Random Walk Metropolis (RWM) algorithm, we start with initial values for $(\epsilon, \lambda)$

step 2

We propose new values for $(\epsilon, \lambda)$ by adding independent Normal increments with mean 0 and variances $\sigma_{\epsilon}^{2}$ and $\sigma_{\lambda}^{2}$, respectively, to the current values

step 3

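We calculate the acceptance probability as $\min\{1, \pi(\epsilon', \lambda' \mid \mathbf{n}) / \pi(\epsilon, \lambda \mid \mathbf{n})\}$, the ratio of the posterior densities at the proposed and current values, capped at 1. The Normal proposal is symmetric, so the proposal densities cancel. Proposals with $\epsilon' \notin(0,1)$ or $\lambda' \leq 0$ have zero posterior density and are always rejected.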

step 4

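We select the proposal variances $\sigma_{\epsilon}^{2}$ and $\sigma_{\lambda}^{2}$ by trial and error to achieve an acceptance rate between 20% and 50%. Variances that are too small give a high acceptance rate but slow exploration, while variances that are too large cause most proposals to be rejected.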

step 5

We iterate this process to obtain a sample of size 10000 from the posterior distribution

Answer

The solution involves initializing the parameters, proposing new values using a normal distribution, calculating acceptance probabilities, and tuning the proposal variances to achieve a reasonable acceptance rate.

Key Concept

Random Walk Metropolis algorithm for MCMC

Explanation

The RWM algorithm is an MCMC method that uses a random walk to explore the parameter space. The proposal variances are critical for the efficiency of the algorithm and need to be tuned for a good acceptance rate.
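The RWM steps above can be sketched in the same way. This is again a Python sketch under stated assumptions rather than the R template the coursework refers to: the starting point `(0.5, 1.0)` and the proposal standard deviations `sd_eps=0.1, sd_lam=0.35` are illustrative starting values for tuning, and the function names are hypothetical.

```python
import math
import random

# Observed counts n_0..n_6 from the table (105 widows in total).
COUNTS = [52, 10, 25, 8, 7, 3, 0]

def log_post(eps, lam):
    """Log posterior up to a constant under the flat prior; -inf outside the support."""
    if not (0.0 < eps < 1.0) or lam <= 0.0:
        return -math.inf
    lp = COUNTS[0] * math.log(eps + (1.0 - eps) * math.exp(-lam))
    for k in range(1, len(COUNTS)):
        # The k! term does not involve the parameters and is dropped.
        lp += COUNTS[k] * (math.log1p(-eps) + k * math.log(lam) - lam)
    return lp

def rwm(n_iter, sd_eps, sd_lam, start=(0.5, 1.0), seed=1):
    rng = random.Random(seed)
    eps, lam = start
    lp = log_post(eps, lam)
    draws, accepted = [], 0
    for _ in range(n_iter):
        # Symmetric Normal random walk step in each coordinate.
        eps_new = eps + rng.gauss(0.0, sd_eps)
        lam_new = lam + rng.gauss(0.0, sd_lam)
        lp_new = log_post(eps_new, lam_new)  # -inf if outside the support
        # Accept with probability min(1, posterior ratio).
        if math.log(1.0 - rng.random()) < lp_new - lp:
            eps, lam, lp = eps_new, lam_new, lp_new
            accepted += 1
        draws.append((eps, lam))
    return draws, accepted / n_iter

draws, rate = rwm(10000, sd_eps=0.1, sd_lam=0.35)
print(f"acceptance rate: {rate:.2f}")
```

Returning `-inf` from `log_post` outside the support means out-of-range proposals are rejected automatically, with no special case in the loop. Trying a few values of `sd_eps` and `sd_lam` and watching the printed rate corresponds to the trial-and-error tuning in step 4.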

---

Solution by Steps

step 1

To report the posterior means and standard deviations of $\epsilon$ and $\lambda$, we calculate the sample mean and standard deviation from the samples obtained from both the Independent sampler and the RWM algorithm

step 2

The sample mean is calculated as the average of the samples, and the sample standard deviation is calculated as the square root of the variance of the samples

Answer

The posterior means and standard deviations are calculated using the sample mean and standard deviation formulas on the samples obtained from the MCMC algorithms.

Key Concept

Posterior summary statistics

Explanation

The posterior mean and standard deviation provide a summary of the central tendency and dispersion of the parameter estimates from the posterior distribution.
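A short Python sketch of this summary step follows. It assumes the sampler output is a list of `(eps, lam)` pairs, and the optional `burn_in` argument (discarding early iterations) is an addition not mentioned above. The `toy_draws` values are placeholders that only show the call; they are not sampler output.

```python
import statistics

def posterior_summary(draws, burn_in=0):
    """Sample mean and standard deviation of each parameter after discarding burn-in."""
    kept = draws[burn_in:]
    eps = [e for e, _ in kept]
    lam = [l for _, l in kept]
    return {
        "eps": (statistics.mean(eps), statistics.stdev(eps)),
        "lam": (statistics.mean(lam), statistics.stdev(lam)),
    }

# Placeholder draws purely to demonstrate the call.
toy_draws = [(0.40, 2.0), (0.50, 2.2), (0.45, 2.1)]
summary = posterior_summary(toy_draws)
print(summary)
```

Applying the same function to the output of each sampler gives the two sets of means and standard deviations to compare.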

---

Solution by Steps

step 1

To estimate the mean number of widows with $0,1, \ldots, 6$ children, we use the predictive distribution, which is based on the posterior samples of $(\epsilon, \lambda)$

step 2

For each posterior sample, we calculate the predictive probabilities for $0,1, \ldots, 6$ children using the zero-inflated Poisson model

step 3

We then average these probabilities across all posterior samples to estimate the posterior predictive probability of each number of children, and multiply each by 105 to obtain the expected number of widows in that category

step 4

We compare these estimates with the observed data and assess the fit of the zero-inflated Poisson model

Answer

The mean number of widows for each number of children is estimated by averaging the predictive probabilities across the posterior samples and multiplying by the sample size of 105, and the model fit is assessed by comparing these estimates with the observed data.

Key Concept

Predictive distribution in Bayesian inference

Explanation

The predictive distribution uses the posterior distribution of the parameters to estimate the probabilities of future observations. It is a key concept in Bayesian predictive modeling.
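The predictive calculation can be sketched as follows, again in Python rather than the R the coursework asks for. It assumes the posterior sample is a list of `(eps, lam)` pairs; the function names are hypothetical, and `toy_draws` is a placeholder used only to show the call, not sampler output.

```python
import math

def zip_pmf(k, eps, lam):
    """P(X = k) under the zero-inflated Poisson model."""
    poisson = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
    return (1.0 - eps) * poisson + (eps if k == 0 else 0.0)

def expected_counts(draws, n=105, k_max=6):
    """Posterior predictive expected number of widows with 0..k_max children."""
    totals = [0.0] * (k_max + 1)
    for eps, lam in draws:
        for k in range(k_max + 1):
            totals[k] += zip_pmf(k, eps, lam)
    # Average the probabilities over the draws, then scale by the sample size.
    return [n * t / len(draws) for t in totals]

# Placeholder draws purely to demonstrate the call.
toy_draws = [(0.40, 2.0), (0.45, 2.2)]
expected = expected_counts(toy_draws)
print([round(x, 1) for x in expected])
```

The seven expected counts sum to slightly less than 105 because the model places some probability on more than 6 children. Running the function on the draws from each sampler and setting the results beside the observed row of the table gives the comparison asked for in the question.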