I came across this paragraph about logistic regression with PCA in Kevin P. Murphy's book on machine learning:
If we use PCA first and then apply logistic regression, the overall model is still representable as a logistic regression, but the problem is constrained, since we have forced the logistic regression to work in the subspace spanned by the PCA vectors. Consider 100 training vectors randomly positioned in a 1000-dimensional space, each assigned a random class, 0 or 1. With very high probability, these 100 vectors will be linearly separable. Now project these vectors onto a 10-dimensional space: with very high probability, 100 vectors in a 10-dimensional space will not be linearly separable. Hence, arguably, we should not use PCA first, since we could potentially transform a linearly separable problem into a non-linearly separable one.
a) Please explain how to understand/visualize why 100 vectors randomly positioned in a 1000-dimensional space will be linearly separable, whereas the same 100 vectors projected into a 10-dimensional space will not be linearly separable.
b) How can this intuition be applied to other problems, if applicable?
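
For reference, here is a small simulation I sketched to sanity-check the claim (my own code, not from the book). It assumes NumPy and scikit-learn are available, and it uses a weakly regularized logistic regression fit as a rough proxy for a linear-separability check (training accuracy of 1.0 taken to mean "separable in practice"):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 1000))   # 100 random points in 1000-D
y = rng.integers(0, 2, size=100)   # random class labels 0 or 1

def train_accuracy(features, labels):
    # Weak regularization (large C) so the fit approximates an
    # unconstrained search for a separating hyperplane.
    clf = LogisticRegression(C=1e6, max_iter=10_000)
    clf.fit(features, labels)
    return clf.score(features, labels)

print("1000-D training accuracy:", train_accuracy(X, y))
X10 = PCA(n_components=10).fit_transform(X)   # project onto top 10 PCs
print("10-D (PCA) training accuracy:", train_accuracy(X10, y))
```

If the book's claim holds, the first accuracy should come out as 1.0 (perfectly separable) and the second should be noticeably lower, since the 10-dimensional projection generally cannot separate 100 randomly labelled points.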