Question 3
We would like to use direct principal component analysis (PCA) or dual PCA to reduce
the dimension of a dataset of n d-dimensional points, X = [x1, · · · , xn] ∈ R^(d×n).
If we define

X =
[  1  1  2  2  4 ]
[ −1  1  1  2  2 ].
(a) For direct PCA, what would be the co-variance matrix S of X? Please show your
derivation to compute this matrix numerically (including any intermediate matrices
you calculate). [4 marks]
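As a quick numerical check of part (a) (a sketch, not a substitute for the hand derivation the question asks for), the centring and covariance computation can be reproduced with NumPy. Note the exam's eigenvalues (10 and 2 in part (b)) imply the unnormalised scatter convention S = X̄X̄ᵀ, with no division by n or n − 1:

```python
import numpy as np

# The data matrix from the question (d = 2, n = 5).
X = np.array([[1.0, 1.0, 2.0, 2.0, 4.0],
              [-1.0, 1.0, 1.0, 2.0, 2.0]])

# Centre each feature (row) by subtracting its mean.
mean = X.mean(axis=1, keepdims=True)   # column vector [[2], [1]]
Xc = X - mean

# Scatter-style covariance matrix: S = Xc Xc^T = [[6, 4], [4, 6]].
# Its eigenvalues are 10 and 2, matching part (b).
S = Xc @ Xc.T
print(S)
```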
(b) The eigenvalues and eigenvectors of S are given as follows:

λ1 = 10, u1 = [1/√2, 1/√2]^T and λ2 = 2, u2 = [1/√2, −1/√2]^T.
Assume we use direct PCA to project X into Y such that the projected dataset Y has
the minimum variance. Please work out in this case how to compute Y numerically.
Please present your results in fraction style (not decimal style).
[4 marks]
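The projection in part (b) can also be sanity-checked numerically (a sketch using the data and eigenvectors stated above). The minimum-variance projection uses the eigenvector of the smallest eigenvalue, u2:

```python
import numpy as np

X = np.array([[1.0, 1.0, 2.0, 2.0, 4.0],
              [-1.0, 1.0, 1.0, 2.0, 2.0]])
Xc = X - X.mean(axis=1, keepdims=True)

# Minimum-variance direction: eigenvector of the SMALLEST eigenvalue.
u2 = np.array([[1.0], [-1.0]]) / np.sqrt(2)

# 1-D projection Y = u2^T Xc, giving [1/√2, −1/√2, 0, −1/√2, 1/√2].
Y = u2.T @ Xc

# Sanity check: the total (unnormalised) variance of Y equals λ2 = 2.
print((Y ** 2).sum())  # 2.0
```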
(c) Now, please show your derivation of how to recover the dataset X̃ from Y
such that it has the same dimension as the original X. Next, please explain (1) why
X̃ is not strictly equal to X and (2) how we can recover the exact X from Y. Please
present your results in fraction style (not decimal style). [4 marks]
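A short numerical sketch of the idea behind part (c), assuming the standard reconstruction X̃ = u2 Y + mean: projecting onto a single eigenvector discards the variance along the other direction, so X̃ ≠ X, whereas projecting onto the full orthonormal basis [u1, u2] recovers X exactly:

```python
import numpy as np

X = np.array([[1.0, 1.0, 2.0, 2.0, 4.0],
              [-1.0, 1.0, 1.0, 2.0, 2.0]])
mean = X.mean(axis=1, keepdims=True)
Xc = X - mean

u1 = np.array([[1.0], [1.0]]) / np.sqrt(2)
u2 = np.array([[1.0], [-1.0]]) / np.sqrt(2)

# Reconstruction from the 1-D projection: X_tilde = u2 (u2^T Xc) + mean.
Y = u2.T @ Xc
X_tilde = u2 @ Y + mean
print(np.allclose(X_tilde, X))   # False: the component along u1 is lost

# Using BOTH eigenvectors (full basis U = [u1, u2]) recovers X exactly.
U = np.hstack([u1, u2])
X_full = U @ (U.T @ Xc) + mean
print(np.allclose(X_full, X))    # True
```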
(d) In dual PCA, we need to compute a square matrix analogous to S. What would be
this matrix? Please show your derivation to compute this matrix numerically. Next,
please prove, using the concept of singular value decomposition, how dual PCA is
derived from direct PCA in terms of the training-data projection. [4 marks]
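A numerical sketch of the relationship part (d) asks about (assuming the usual dual-PCA construction): dual PCA works with the n × n Gram matrix X̄ᵀX̄ instead of the d × d scatter matrix X̄X̄ᵀ, and the SVD X̄ = UΣVᵀ links the two, since X̄X̄ᵀ = UΣ²Uᵀ and X̄ᵀX̄ = VΣ²Vᵀ share the same non-zero eigenvalues:

```python
import numpy as np

X = np.array([[1.0, 1.0, 2.0, 2.0, 4.0],
              [-1.0, 1.0, 1.0, 2.0, 2.0]])
Xc = X - X.mean(axis=1, keepdims=True)

# Dual PCA uses the n x n Gram matrix instead of the d x d scatter matrix.
G = Xc.T @ Xc   # 5 x 5

# SVD link: Xc = U S V^T, so Xc Xc^T = U S^2 U^T and Xc^T Xc = V S^2 V^T.
# The non-zero eigenvalues of both matrices are the squared singular values.
s = np.linalg.svd(Xc, compute_uv=False)
print(np.sort(s ** 2)[::-1][:2])   # [10.  2.], matching part (b)
```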
(e) Now we want to use direct PCA to reduce the dimension of a new dataset of 10
dimensions (i.e., d = 10). Assume that the first five eigenvalues of the co-variance
matrix are respectively λ1 = 10, λ2 = 2, λ3 = 0.15, λ4 = 0.05 and λ5 = 0.02,
and that the remaining eigenvalues are all zero. Please work out which principal
components we should use to project the dataset such that afterwards we can retain
at least 95% of the variance of this dataset. Please show your numerical derivation
in detail. [4 marks]
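The variance-retention calculation in part (e) can be sketched numerically as follows: the total variance is the sum of all eigenvalues (12.22 here), and we take the smallest k whose cumulative eigenvalue fraction reaches 95%:

```python
import numpy as np

# Eigenvalues from part (e); the remaining five are zero.
lams = np.array([10.0, 2.0, 0.15, 0.05, 0.02])
total = lams.sum()                 # 12.22

# Cumulative fraction of variance retained by the first k PCs.
ratios = np.cumsum(lams) / total   # ~[0.818, 0.982, 0.994, 0.998, 1.0]

# Smallest k with at least 95% retained: k = 2, since
# 10/12.22 ~ 81.8% < 95% but (10 + 2)/12.22 ~ 98.2% >= 95%.
k = int(np.argmax(ratios >= 0.95)) + 1
print(k)   # 2 -> use the first two principal components
```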