Apologies if this has been asked before; nothing turned up when I searched.
I'm noticing some interesting behavior when I run PCA on pairs of dummy variables I made up, each of which is a permutation of a fixed set (here just the integers from 1 to 10). In R:
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
y <- c(10, 2, 1, 5, 4, 3, 9, 8, 7, 6)
z <- c(8, 3, 2, 1, 4, 7, 9, 6, 5, 10)
df1 <- data.frame(x, y)
df2 <- data.frame(x, z)
df3 <- data.frame(y, z)
I then use prcomp:
> prcomp(df1)
Standard deviations (1, .., p=2):
[1] 3.415650 2.581989
Rotation (n x k) = (2 x 2):
PC1 PC2
x 0.7071068 -0.7071068
y 0.7071068 0.7071068
> prcomp(df2)
Standard deviations (1, .., p=2):
[1] 3.681787 2.185813
Rotation (n x k) = (2 x 2):
PC1 PC2
x 0.7071068 -0.7071068
z 0.7071068 0.7071068
> prcomp(df3)
Standard deviations (1, .., p=2):
[1] 3.858612 1.855921
Rotation (n x k) = (2 x 2):
PC1 PC2
y -0.7071068 -0.7071068
z -0.7071068 0.7071068
So every entry of every principal component is either $\frac{\sqrt{2}}{2}$ or $-\frac{\sqrt{2}}{2}$. I'm not sure exactly why this happens, although it makes a certain kind of sense: both variables contain the same values, just in a different order, and if the loadings differed in magnitude, PCA would be 'preferring' one variable over the other.
That's a very high-level and hand-wavy view of things, though. Also, if I use more than two variables at a time, this behavior disappears:
> prcomp(data.frame(x, y, z))
Standard deviations (1, .., p=3):
[1] 4.208109 2.603247 1.736355
Rotation (n x k) = (3 x 3):
PC1 PC2 PC3
x 0.5003708 0.8107791 -0.3037538
y 0.5781928 -0.5740476 -0.5797951
z 0.6444549 -0.1144843 0.7560233
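To make the symmetry idea a bit more concrete, here is a sketch, assuming `prcomp` with its defaults (centering, no scaling) amounts to an eigendecomposition of the sample covariance matrix. The symbols $\Sigma$, $a$, and $b$ are my own notation, not part of the output. Since the two variables in a pair are permutations of the same values, they share a variance $a$; write $b$ for their covariance:

$$
\Sigma = \begin{pmatrix} a & b \\ b & a \end{pmatrix}, \qquad
\Sigma \begin{pmatrix} 1 \\ 1 \end{pmatrix} = (a+b)\begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad
\Sigma \begin{pmatrix} 1 \\ -1 \end{pmatrix} = (a-b)\begin{pmatrix} 1 \\ -1 \end{pmatrix}.
$$

For `df1` this seems consistent with the output: $a = 55/6 \approx 9.167$ and $b = 2.5$, so $\sqrt{a+b} \approx 3.4157$ and $\sqrt{a-b} \approx 2.5820$. With three variables the diagonal of $\Sigma$ is still constant, but $(1,1,1)^\top$ is an eigenvector only if every row has the same sum, which would require the three pairwise covariances to be equal, and here they are not (about 2.5, 4.39, and 5.72). I may well be missing something, though.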
Can someone give me some insight into what's going on here?