Oct 15, 2024

Instructions: For questions 1~4, please attach/copy your R code at the end of your answer for each question. R script worth 40% of the mark. Please be concise and only include what I am asking for. File type should be either WORD or PDF. Word limit: 2000 words. Late penalty: 5% of the total marks will be deducted for each day past the due date. Work submitted after 5th day (i.e., 120 hours past the due date) will normally receive a mark 0.

Question 1: Please import the dataset “wild.csv”. Suppose you are asked to test whether the response ratio in the arm of Panitumumab and FOLFOX is different from the one in the arm of FOLFOX alone.
(a) Please state the null and alternative hypotheses (5 points)
(b) Please calculate the test statistic and the p-value of your test (5 points; Need to specify the formula components)
(c) How will you estimate the response ratio for each arm? Calculate the Fisher information for it and verify whether variance of your estimate has achieved the lower bound. (10 points)

Question 2: In the “wild” dataset, investigators are curious about whether Panitumumab and FOLFOX can prolong patients’ progression-free survival (PFS, PFSDYCR) compared to FOLFOX alone, assuming that there is no censoring. (Hint: In this question, PFS should be treated as a common one-dimensional random variable.)
(a) Please state the null and alternative hypotheses (5 points)
(b) Please calculate the test statistics and the p-value of your test (5 points)
(c) Assume PFS in both arms follows an exponential distribution and one is interested in estimating their medians. What would be your estimates? Is your estimate a MVUE? (10 points)

Question 3: Still based on the “wild” dataset. Investigators are also curious about whether Panitumumab and FOLFOX can prolong patients’ overall survival (OS, DTHDY) compared to FOLFOX alone, assuming that there is no censoring. (Hint: In this question, OS should be treated as a common one-dimensional random variable.)
(a) Please state the null and alternative hypothesis. (5 points)
(b) Conduct the testing and draw the conclusion. (5 points)
(c) If the test on PFS is primary and the one on OS is secondary, how would you conduct the tests and what is your conclusion? (10 points)

Question 4: Evaluations of the required sample size for the “wild” dataset with a 1:1 randomization ratio.
(a) Based on the results on the response rate from Q1, how many subjects need to be enrolled to achieve a power at the level of 0.9? (5 points)
(b) Based on the results on PFS from Q2, how many subjects need to be enrolled to achieve a power at the level of 0.9? (5 points)
(c) If one desires simultaneous successes in both PFS and OS, how many subjects need to be enrolled to achieve a power at the level of 0.9? (5 points)
(d) Could you explain why a most powerful test, instead of any test, is needed for sample size calculations? (5 points)

Solution by Steps

Question 1:

step 1

The null hypothesis $H_0$ states that the response ratio in the arm of Panitumumab and FOLFOX is equal to the response ratio in the arm of FOLFOX alone, i.e., $H_0: \frac{p_1}{p_2} = 1$. The alternative hypothesis $H_a$ states that the response ratios are different, i.e., $H_a: \frac{p_1}{p_2} \neq 1$.

step 2

To calculate the test statistic, we can use the pooled two-sample z-statistic for proportions:

$$z = \frac{\hat{p_1} - \hat{p_2}}{\sqrt{ \hat{p}(1 - \hat{p}) \left( \frac{1}{n_1} + \frac{1}{n_2} \right) }}$$

where $\hat{p_1} = x_1 / n_1$ and $\hat{p_2} = x_2 / n_2$ are the sample response proportions and $\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$ is the pooled proportion. Here, $x_1$ and $x_2$ are the numbers of successes (responders) in each group, and $n_1$ and $n_2$ are the sample sizes.

step 3

The p-value can be calculated using the standard normal distribution. If $z$ is the calculated test statistic, the two-sided p-value is given by $p\text{-value} = 2 \cdot P(Z > |z|)$, where $Z$ is a standard normal variable.
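As an illustration of the z-statistic and two-sided p-value described above, here is a minimal Python sketch using only the standard library. The function name `two_proportion_z_test` and the counts passed to it are hypothetical placeholders; they are not taken from wild.csv.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sample z-test for H0: p1 = p2, two-sided."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Pooled proportion under H0
    p_pool = (x1 + x2) / (n1 + n2)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    # Two-sided p-value: 2 * P(Z > |z|)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts, NOT the values in wild.csv
z, p_value = two_proportion_z_test(x1=60, n1=100, x2=45, n2=100)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

With the real data, `x1`, `n1`, `x2`, and `n2` would be the responder counts and arm sizes tabulated from the dataset.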

step 4

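To estimate the response ratio for each arm, we can use the sample proportions $\hat{p_1} = x_1 / n_1$ and $\hat{p_2} = x_2 / n_2$, which are the maximum likelihood estimators under a Bernoulli model. For arm $i$, the Fisher information about $p_i$ contained in $n_i$ independent Bernoulli observations is

$$I_{n_i}(p_i) = \frac{n_i}{p_i(1 - p_i)}$$

so the Cramér-Rao lower bound for an unbiased estimator of $p_i$ is $\frac{p_i(1 - p_i)}{n_i}$. Since $\hat{p_i}$ is unbiased with $\text{Var}(\hat{p_i}) = \frac{p_i(1 - p_i)}{n_i}$, its variance equals the bound, so the lower bound is achieved in each arm.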
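The Cramér-Rao check can also be done numerically. The sketch below assumes that responses within an arm are i.i.d. Bernoulli, so the number of responders is binomial. It compares the reciprocal of the Fisher information with the exact variance of the sample proportion. The values of `p` and `n` and both function names are hypothetical.

```python
from math import comb


def fisher_information(p, n):
    """Fisher information about p in n i.i.d. Bernoulli(p) observations."""
    return n / (p * (1 - p))


def exact_variance_of_sample_proportion(p, n):
    """Var(X/n) for X ~ Binomial(n, p), summed over the exact pmf."""
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum((k / n) * w for k, w in enumerate(pmf))
    return sum(((k / n) - mean) ** 2 * w for k, w in enumerate(pmf))


# Hypothetical response probability and arm size
p, n = 0.4, 50
crlb = 1 / fisher_information(p, n)
var_hat = exact_variance_of_sample_proportion(p, n)
print(f"Cramer-Rao bound: {crlb:.6f}, Var(p_hat): {var_hat:.6f}")
```

The two printed numbers agree up to floating-point error, which is the numerical counterpart of the estimator attaining the bound.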

Answer

The null hypothesis is that the response ratios are equal, while the alternative is that they are different. The test statistic and p-value can be calculated using the provided formulas. The Fisher information can be computed to check the variance.

Question 2:

step 1

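The null hypothesis $H_0$ states that the median PFS for patients receiving Panitumumab and FOLFOX is equal to that for patients receiving FOLFOX alone, i.e., $H_0: \text{Median}_{PFS1} = \text{Median}_{PFS2}$. The alternative hypothesis $H_a$ states that the medians are different, i.e., $H_a: \text{Median}_{PFS1} \neq \text{Median}_{PFS2}$. Because the question asks whether the combination prolongs PFS, the one-sided alternative $\text{Median}_{PFS1} > \text{Median}_{PFS2}$ is also a natural choice.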

step 2

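To calculate the test statistic, we can use the Mann-Whitney U test, which does not assume normality. The test statistic $U$ can be calculated as:

$$U = R_1 - \frac{n_1(n_1 + 1)}{2}$$

where $R_1$ is the sum of the ranks of group 1 in the combined sample and $n_1$ is the sample size of group 1. The p-value can be derived from the exact null distribution of $U$ or, for large samples, from the normal approximation with mean $\frac{n_1 n_2}{2}$ and variance $\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}$, where $n_2$ is the sample size of group 2.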
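A minimal Python sketch of the U statistic follows. It uses the pair-counting form of $U$, which equals $R_1 - \frac{n_1(n_1+1)}{2}$ when midranks are used, and a large-sample normal approximation without tie correction for the p-value. Both are simplifying assumptions, and the PFS values are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def mann_whitney_u(group1, group2):
    """U for group 1 plus a two-sided normal-approximation p-value.

    Ties count as 1/2; no tie correction is applied to the variance.
    """
    n1, n2 = len(group1), len(group2)
    # Number of (a, b) pairs with a > b; equals R_1 - n1 * (n1 + 1) / 2
    u = sum((a > b) + 0.5 * (a == b) for a in group1 for b in group2)
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return u, z, p_value


# Hypothetical PFS days, NOT the values in wild.csv
pfs_combo = [210, 305, 180, 400, 275, 330]
pfs_folfox = [150, 220, 190, 260, 170, 240]
u, z, p_value = mann_whitney_u(pfs_combo, pfs_folfox)
print(f"U = {u}, z = {z:.3f}, p-value = {p_value:.4f}")
```

For samples as small as this toy example, the exact null distribution of $U$ would be preferable to the normal approximation.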

step 3

Assuming PFS follows an exponential distribution, the median is $\text{Median} = \frac{\ln(2)}{\lambda}$, where $\lambda$ is the rate parameter. The maximum likelihood estimator (MLE) for $\lambda$ is $\hat{\lambda} = \frac{1}{\bar{x}}$, where $\bar{x}$ is the sample mean, so the estimated median in each arm is $\ln(2)\,\bar{x}$. This estimator is unbiased because $E[\bar{x}] = \frac{1}{\lambda}$, and it is a function of the complete sufficient statistic $\sum_i x_i$, so by the Lehmann-Scheffé theorem it is the MVUE of the median.
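The plug-in median estimate $\ln(2)\,\bar{x}$ and its unbiasedness can be illustrated with a small seeded simulation. This is a sketch under an assumed exponential model. The rate, sample size, and number of replicates are arbitrary choices, and `exponential_median_estimate` is a hypothetical helper name.

```python
import random
from math import log


def exponential_median_estimate(times):
    """Median estimate ln(2) * sample mean under an exponential model."""
    return log(2) * sum(times) / len(times)


random.seed(1)
rate = 0.01  # hypothetical rate parameter (per day)
true_median = log(2) / rate
n, reps = 50, 5000

# Repeatedly draw exponential samples and estimate the median each time
estimates = [
    exponential_median_estimate([random.expovariate(rate) for _ in range(n)])
    for _ in range(reps)
]
avg_estimate = sum(estimates) / reps
print(f"true median = {true_median:.2f}, average estimate = {avg_estimate:.2f}")
```

The average of the estimates lands close to the true median, consistent with the estimator being unbiased. The simulation illustrates the property but does not prove it.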

Answer

The null hypothesis states that the medians of PFS are equal, while the alternative states they are different. The test statistic can be calculated using the Mann-Whitney U test, and the median estimates can be derived from the exponential distribution.

Question 3:

step 1

The null hypothesis $H_0$ states that the median OS for patients receiving Panitumumab and FOLFOX is equal to that for patients receiving FOLFOX alone, i.e., $H_0: \text{Median}_{OS1} = \text{Median}_{OS2}$. The alternative hypothesis $H_a$ states that the medians are different, i.e., $H_a: \text{Median}_{OS1} \neq \text{Median}_{OS2}$.

step 2

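To conduct the testing, we can again use the Mann-Whitney U test, or a log-rank test if OS is treated as survival data. The test statistic $U$ can be calculated as in Question 2, and the p-value can be derived accordingly. We reject $H_0$ if the p-value is below the chosen significance level (e.g., 0.05) and otherwise conclude that the data do not show a difference in OS between the arms.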

step 3

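If the test on PFS is primary and the one on OS is secondary, we would test them hierarchically (a fixed-sequence procedure). The PFS test is conducted first at the full significance level $\alpha$. Only if it is significant do we go on to test OS, again at level $\alpha$; if the PFS test is not significant, OS is not formally tested and no claim is made on it. Testing in this fixed order controls the overall type I error at $\alpha$ without splitting it between the two endpoints.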
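One common way to formalize a primary-then-secondary ordering is a fixed-sequence (hierarchical) procedure, in which the secondary endpoint is tested only if the primary one is significant. The sketch below shows that decision rule. The function name and the p-values are hypothetical and are not results from the dataset.

```python
def fixed_sequence_test(p_primary, p_secondary, alpha=0.05):
    """Fixed-sequence testing of a primary (PFS) and secondary (OS) endpoint.

    The secondary endpoint is tested at level alpha only if the primary
    endpoint is significant at level alpha.
    """
    primary_significant = p_primary < alpha
    # Gate: without a significant primary result, no claim is made on OS
    secondary_significant = primary_significant and p_secondary < alpha
    return {"PFS": primary_significant, "OS": secondary_significant}


# Hypothetical p-values
print(fixed_sequence_test(p_primary=0.01, p_secondary=0.03))
print(fixed_sequence_test(p_primary=0.20, p_secondary=0.03))
```

In the second call the OS p-value is small, but no OS claim is made because the primary PFS test did not pass the gate.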

Answer

The null hypothesis states that the medians of OS are equal, while the alternative states they are different. The testing can be conducted using the Mann-Whitney U test or log-rank test, and the conclusions should consider the primary and secondary nature of the tests.

Question 4:

step 1

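To determine the required sample size for achieving a power of 0.9 based on the response rate from Question 1, we can use the formula:

$$n = \frac{(z_{\alpha/2} + z_{\beta})^2 \cdot (p_1(1 - p_1) + p_2(1 - p_2))}{(p_1 - p_2)^2}$$

where $n$ is the number of subjects per arm (so $2n$ in total under 1:1 randomization), $z_{\alpha/2}$ is the critical value for the two-sided significance level $\alpha$, and $z_{\beta}$ is the standard normal quantile for the desired power $1 - \beta$ (for power 0.9, $\beta = 0.1$ and $z_{\beta} \approx 1.28$). The values of $p_1$ and $p_2$ are taken from the estimates in Question 1.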
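The sample size formula above can be evaluated directly. In this sketch the response rates are hypothetical stand-ins for the Question 1 estimates, and the result is interpreted as a per-arm count, rounded up.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm_two_proportions(p1, p2, alpha=0.05, power=0.9):
    """Per-arm sample size for a two-sided test of p1 versus p2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    z_beta = NormalDist().inv_cdf(power)           # z_{beta}, beta = 1 - power
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2
    return ceil(n)


# Hypothetical response rates, NOT the estimates from wild.csv
n_arm = n_per_arm_two_proportions(p1=0.60, p2=0.45)
print(f"per arm: {n_arm}, total (1:1 randomization): {2 * n_arm}")
```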

step 2

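Similarly, for PFS from Question 2, the sample size is obtained by fixing $\alpha$ and a power of 0.9 and plugging in the effect size estimated in Question 2. Under the exponential model, the effect is the ratio of the two rates (equivalently, of the two medians), estimated from the arm-specific sample means.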
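One possible way to make the exponential-based calculation concrete is a Wald test on the log ratio of the two rates. With no censoring, $\text{Var}(\log \hat{\lambda}) \approx 1/n$ in each arm, which gives a per-arm size of $n = 2 (z_{\alpha/2} + z_{\beta})^2 / (\log(\lambda_1 / \lambda_2))^2$. This choice of test is an assumption made for illustration, not a method stated in the original answer, and the medians below are hypothetical.

```python
from math import ceil, log
from statistics import NormalDist


def n_per_arm_exponential(median1, median2, alpha=0.05, power=0.9):
    """Per-arm sample size for a two-sided Wald test on the log rate ratio.

    Assumes exponential outcomes with no censoring, so that
    Var(log(lambda_hat)) is approximately 1/n in each arm.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # The ratio of medians is the inverse ratio of rates; the sign drops out
    log_ratio = log(median1 / median2)
    return ceil(2 * (z_alpha + z_beta) ** 2 / log_ratio**2)


# Hypothetical median PFS in days, NOT the estimates from wild.csv
n_pfs_arm = n_per_arm_exponential(median1=300, median2=200)
print(f"per arm: {n_pfs_arm}, total (1:1 randomization): {2 * n_pfs_arm}")
```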

step 3

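For simultaneous successes in both PFS and OS, each test can be run at the full significance level $\alpha$, because a claim is made only when both are significant. What changes is the power: the probability that both tests succeed is lower than the power of either test alone. A conservative approach applies the Bonferroni inequality to the type II error, powering each endpoint at $1 - 0.1/2 = 0.95$ so that the joint power is at least 0.9, and then enrolls the larger of the two resulting sample sizes.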
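One conservative way to size a trial that must succeed on both endpoints is to split the allowed type II error between them, since the joint power is at least $1 - \beta_{PFS} - \beta_{OS}$. The sketch below assumes this Bonferroni-type bound together with an exponential, no-censoring Wald calculation for each endpoint. The medians and helper names are hypothetical.

```python
from math import ceil, log
from statistics import NormalDist


def per_endpoint_power(joint_power, n_endpoints=2):
    """Power needed per endpoint so that joint power >= joint_power,
    using the bound: joint power >= 1 - sum of per-endpoint betas."""
    return 1 - (1 - joint_power) / n_endpoints


def n_per_arm_log_ratio(median1, median2, alpha, power):
    """Per-arm size for a two-sided Wald test on the log rate ratio
    (exponential model, no censoring: an illustrative assumption)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 / log(median1 / median2) ** 2)


target = per_endpoint_power(0.9)  # power required for each endpoint
# Hypothetical medians in days, NOT the estimates from wild.csv
n_pfs = n_per_arm_log_ratio(300, 200, alpha=0.05, power=target)
n_os = n_per_arm_log_ratio(600, 480, alpha=0.05, power=target)
n_required = max(n_pfs, n_os)  # both endpoints must be adequately powered
print(f"per-endpoint power: {target:.2f}, per arm: {n_required}")
```

The bound ignores any positive correlation between PFS and OS, so the resulting sample size is on the cautious side.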

step 4

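A most powerful test is needed because the required sample size depends on the power function of the test being used. For a given significance level and effect size, the most powerful test reaches the target power with the fewest subjects; a less powerful test would need a larger sample to reach the same power. Basing the calculation on the most powerful test therefore gives the smallest sample size that ensures the desired probability of correctly rejecting the null hypothesis when it is false.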

Answer

The required sample sizes can be calculated using the provided formulas for both response rates and PFS. For simultaneous successes, the design must account for testing two endpoints together. A most powerful test is crucial for efficient sample size calculations.