
Since kernel PCA is just PCA carried out in a higher-dimensional feature space, shouldn't the eigenvectors it produces be orthogonal?

Suppose I have $n$ data points, and let $a$ and $b$ be two eigenvectors of the covariance matrix of the mapped data, with $\alpha \in \mathbb{R}^n$ and $\beta \in \mathbb{R}^n$ the corresponding eigenvectors of the kernel PCA problem (obtained from an eigenanalysis of the kernel matrix $K$). Then $a$ and $b$ can be written as linear combinations of the mapped input data, i.e. $$a = \sum_{i=1}^n \alpha_i \phi(x_i)$$ and $$b = \sum_{j=1}^n \beta_j \phi(x_j).$$

Their inner product is: $$\langle a,b \rangle \,=\, \left\langle \sum_{i=1}^n \alpha_i \phi(x_i) ,\sum_{j=1}^n \beta_j \phi(x_j)\right\rangle \, = \, \sum_{i=1}^n\sum_{j=1}^n \alpha_i \beta_j K_{ij},$$ where $K_{ij} = \langle \phi(x_i), \phi(x_j) \rangle$.

Why is this value zero?
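To make the question concrete, here is a minimal numerical sketch (my own illustration, using NumPy with a Gaussian kernel; centering of $K$ is omitted for brevity) that computes this inner product for two eigenvectors of $K$:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))                  # n = 20 points in R^3

# Gaussian kernel matrix, K_ij = exp(-||x_i - x_j||^2 / 2)
sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
K = np.exp(-sq_dists / 2)

# Eigenvectors of the symmetric matrix K (columns of V)
eigvals, V = np.linalg.eigh(K)
alpha, beta = V[:, -1], V[:, -2]              # two distinct eigenvectors

print(alpha @ K @ beta)                       # ~1e-16, i.e. numerically zero
```

Empirically the value comes out as zero, but I don't see why.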


1 Answer


Yes, they are orthogonal. To see that your last expression equals zero, write it in vector notation: $$\sum_{i,j=1}^n \alpha_i \beta_j K_{ij} = \boldsymbol \alpha^\top \boldsymbol K \boldsymbol \beta=0,$$ because $\boldsymbol \alpha$ and $\boldsymbol \beta$ are two different eigenvectors of $\boldsymbol K$. This is a standard linear algebra fact. If $\boldsymbol K \boldsymbol \alpha = \lambda_1 \boldsymbol \alpha$ and $\boldsymbol K\boldsymbol \beta = \lambda_2 \boldsymbol \beta$ with $\lambda_2 \ne 0$ (which holds for any eigenvector actually used in kernel PCA), then, using the symmetry of $\boldsymbol K$, $$\boldsymbol \beta^\top \boldsymbol K\boldsymbol \alpha = \lambda_1 \boldsymbol \beta^\top \boldsymbol \alpha = \lambda_1 \boldsymbol \alpha^\top \boldsymbol \beta = \frac{\lambda_1}{\lambda_2}\boldsymbol \alpha^\top (\lambda_2 \boldsymbol \beta) = \frac{\lambda_1}{\lambda_2}\boldsymbol \alpha^\top \boldsymbol K\boldsymbol \beta = \frac{\lambda_1}{\lambda_2}\boldsymbol \beta^\top \boldsymbol K\boldsymbol \alpha.$$ Hence $\left(1 - \frac{\lambda_1}{\lambda_2}\right)\boldsymbol \beta^\top \boldsymbol K \boldsymbol \alpha = 0$, so if $\lambda_1 \ne \lambda_2$ then $\boldsymbol \beta^\top \boldsymbol K\boldsymbol \alpha=\boldsymbol \alpha^\top \boldsymbol K\boldsymbol \beta=0$.
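A quick numerical sanity check of this fact (a sketch with an arbitrary random symmetric matrix, not specific to kernels):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(6, 6))
K = A + A.T                        # arbitrary symmetric matrix

eigvals, V = np.linalg.eigh(K)     # columns of V are eigenvectors
alpha, beta = V[:, 0], V[:, 1]     # eigenvectors with distinct eigenvalues

print(beta @ alpha)                # ~0: the eigenvectors are orthogonal
print(beta @ K @ alpha)            # ~0: hence beta^T K alpha = 0 as well
```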

Note, however, that the covariance matrix in the target space often cannot even be defined, because the target space can be infinite-dimensional (this is the case e.g. with the Gaussian kernel). This means that your $\mathbf a$ and $\mathbf b$ can be infinite-dimensional, making the computations above somewhat sloppy... What I think is more important is that $\boldsymbol \alpha$ and $\boldsymbol \beta$ are orthogonal -- this means that the kernel PCs have zero correlation. See my answer here for more details: Is Kernel PCA with linear kernel equivalent to standard PCA?
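To illustrate the zero-correlation point, here is a small sketch using scikit-learn's `KernelPCA` (which centers the kernel matrix internally); the off-diagonal correlations between the kernel PC scores come out as numerically zero:

```python
import numpy as np
from sklearn.decomposition import KernelPCA

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 5))

# Scores of the first three kernel PCs with an RBF kernel
Z = KernelPCA(n_components=3, kernel="rbf").fit_transform(X)

# Correlation matrix of the scores: ~identity, so the PCs are uncorrelated
print(np.corrcoef(Z, rowvar=False).round(6))
```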
