Suppose I am performing PCA on 3 standardized variables: height, weight, and income. I understand that each principal component maximizes variance along a new line, but there are two ways I can see this happening, and I am unsure which is accurate:
1) The first principal component considers all three variables and finds the line of greatest variation through the entire three-dimensional data cloud.
vs.
2) The first principal component is calculated to maximize variance in the two-dimensional plane spanned by the two variables with the greatest covariance.
I suspect the first explanation is true, but given that each entry in the covariance matrix involves no more than two variables, I am unsure of my logic.
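
One way to test the two explanations numerically is the rough sketch below. It uses synthetic stand-ins for height, weight, and income (made-up data, not real measurements). The variance of the data projected onto a unit direction w is w^T S w, where S is the covariance matrix. That expression sums the pairwise entries of S weighted by all three components of w at once, so pairwise covariances can still describe a direction in the full 3-D space. The sketch compares the variance along the leading eigenvector of S (explanation 1) with the best variance achievable inside the plane of the most strongly covarying pair (explanation 2). All variable names here are my own choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for height, weight, income (hypothetical data).
n = 500
latent = rng.normal(size=n)
height = latent + 0.5 * rng.normal(size=n)
weight = 0.8 * latent + 0.6 * rng.normal(size=n)
income = 0.3 * latent + rng.normal(size=n)
X = np.column_stack([height, weight, income])

# Standardize each column (zero mean, unit sample variance).
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# 3x3 covariance matrix; each entry involves at most two variables.
S = np.cov(Z, rowvar=False)

# Explanation 1: leading eigenvector of the full 3x3 matrix.
eigvals, eigvecs = np.linalg.eigh(S)  # eigenvalues in ascending order
w1 = eigvecs[:, -1]                   # direction of the first component
var_pc1 = eigvals[-1]                 # variance along that direction

# Explanation 2: best direction restricted to the plane of the
# pair of variables with the largest absolute covariance.
off_diag = np.abs(S - np.diag(np.diag(S)))
i, j = np.unravel_index(np.argmax(off_diag), off_diag.shape)
S_pair = S[np.ix_([i, j], [i, j])]
var_plane = np.linalg.eigvalsh(S_pair)[-1]

# Sanity check: variance w^T S w along many random unit directions in 3-D.
dirs = rng.normal(size=(10000, 3))
U = dirs / np.linalg.norm(dirs, axis=1, keepdims=True)
random_vars = np.einsum("ij,jk,ik->i", U, S, U)

print("first-component loadings:", w1)
print("variance along first component:", var_pc1)
print("best variance within the 2-variable plane:", var_plane)
print("best variance among random 3-D directions:", random_vars.max())
```

The variance along the leading eigenvector of the full matrix can never be smaller than the best variance within any two-variable plane, and no random 3-D direction should beat it either. The loadings in `w1` will generally be nonzero for all three variables, which is what explanation 1 predicts.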