I have a dataset with $n$ observations and $p$ variables (dimensions). Generally $n$ is small ($n = 12$ to $16$), while $p$ may range from small ($p = 4$ to $10$) to perhaps much larger ($p = 30$ to $50$).
I remember learning that $n$ should be much larger than $p$ in order to run principal component analysis (PCA) or factor analysis (FA), but that condition does not hold for my data. Note that for my purposes I am rarely interested in any principal components past PC2.
Questions:
- What are the rules of thumb for minimum sample size when PCA is OK to use, and when it is not?
- Is it ever OK to use the first few PCs even if $n=p$ or $n<p$?
- Are there any references on this?
Does the answer depend on whether the main goal is to use PC1, and possibly PC2, either:
- simply graphically, or
- as a synthetic variable that is then used in regression?
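To make the $n < p$ setting concrete, here is a minimal sketch of the kind of computation I have in mind. The data are simulated Gaussian noise and the sizes ($n = 14$, $p = 40$) are just assumptions picked from the ranges above; `pca_svd` is a hypothetical helper, not a library function. It only illustrates that PCA is still computable in this setting, with at most $n - 1$ components carrying nonzero variance after centering. It says nothing about whether those components are stable.

```python
import numpy as np

def pca_svd(X):
    """PCA via SVD of the column-centered data matrix (hypothetical helper).

    Returns the component scores and the proportion of variance per component.
    """
    Xc = X - X.mean(axis=0)                      # center each variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    var = s**2 / (X.shape[0] - 1)                # variance along each component
    scores = U * s                               # observations projected on the PCs
    return scores, var / var.sum()

rng = np.random.default_rng(0)
n, p = 14, 40                                    # assumed sizes, n < p
X = rng.standard_normal((n, p))                  # simulated data, not real measurements

scores, ratio = pca_svd(X)
nonzero = int(np.sum(ratio > 1e-12))             # components with non-negligible variance
print(nonzero)                                   # n - 1 = 13: centering removes one degree of freedom
print(ratio[:2])                                 # share of variance in PC1 and PC2
```

The first two columns of `scores` are what I would plot or carry forward into a regression.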