Examples (and how to generate them) of various conceptually different datasets to throw at PCA to gain better intuition for it

Question

I've developed a solid understanding of principal component analysis (PCA), to the point where I can write my own implementation of it in Python. I fear this is the easy part, and that the hard part is developing a rigorous intuition for what the results of running PCA on a dataset will be, and determining whether PCA is the best method to use on my dataset.

Because the textbooks really don't go beyond defining what PCA is (they stop before telling me how to actually use it), I'm looking for advice on how I can play around with PCA to develop a deeper appreciation for all its nuances.

I'm thinking I can test it on data with various covariance characteristics, e.g. a purely random dataset, datasets with linear relationships among variables, etc., but I'm not sure how to make these vague ideas concrete. Can you provide some examples of the kinds of datasets I can throw at PCA, and of how I can analyze the results to determine why they are what they are?
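To make the idea a little more concrete, here is a minimal sketch of the kind of experiment I have in mind. It uses NumPy only; the helper `pca_explained_variance_ratio`, the four dataset recipes, and all sizes and noise levels are my own illustrative choices, not taken from any textbook. Each dataset is built so that the covariance structure is known in advance, which means the PCA output can be checked against what I expect.

```python
import numpy as np

def pca_explained_variance_ratio(X):
    """Fraction of total variance captured by each principal component."""
    Xc = X - X.mean(axis=0)                      # center each column
    s = np.linalg.svd(Xc, compute_uv=False)      # singular values, descending
    variances = s**2 / (X.shape[0] - 1)          # eigenvalues of the sample covariance
    return variances / variances.sum()

rng = np.random.default_rng(0)
n = 2000

# 1. Isotropic noise: no preferred direction, so variance should split evenly.
isotropic = rng.normal(size=(n, 3))

# 2. Linear structure: three variables driven by one latent factor plus small noise,
#    so nearly all variance should land on the first component.
latent = rng.normal(size=(n, 1))
linear = latent @ np.array([[1.0, 2.0, -1.0]]) + 0.1 * rng.normal(size=(n, 3))

# 3. Nonlinear structure: points on a circle are one-dimensional in angle,
#    but PCA only sees two uncorrelated coordinates of equal variance.
theta = rng.uniform(0.0, 2.0 * np.pi, size=n)
circle = np.column_stack([np.cos(theta), np.sin(theta)])

# 4. Scale effects: independent variables in very different units,
#    so the large-scale variable dominates unless the data are standardized.
scaled = rng.normal(size=(n, 2)) * np.array([100.0, 1.0])

datasets = {"isotropic": isotropic, "linear": linear, "circle": circle, "scaled": scaled}
ratios = {name: pca_explained_variance_ratio(X) for name, X in datasets.items()}

for name, r in ratios.items():
    print(f"{name:10s}", np.round(r, 3))
```

What I'd like is guidance on which other recipes of this kind are worth trying, and on how to reason from a known data-generating process to the expected components.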

I'll argue that it's a bit different from that thread (which, of course, is required reading). Here the OP would like examples of datasets with different conceptual features, in order to discover some intuition about PCA for themselves. — Matthew Drury, Mar 29 '16 at 02:02
@MatthewDrury Requests for data are off-topic, but the component about what PCA means is covered thoroughly by the other thread. — Sycorax, Mar 29 '16 at 03:38
