I have a dataset and performed the following operations:
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X_train)
pca_test = PCA(n_components=10)
pca_test.fit(X_scaled) #------------Using scaled data
cvr = np.cumsum(pca_test.explained_variance_ratio_)
pca_test = PCA(n_components=10)
pca_test.fit(X_train) #------------Using non scaled data
cvr2 = np.cumsum(pca_test.explained_variance_ratio_)
print(np.round(cvr, 2))
[0.31 0.52 0.69 0.81 0.9 0.96 0.98 0.99 1. 1. ]
print(np.round(cvr2, 2))
[0.97 0.99 0.99 1. 1. 1. 1. 1. 1. 1. ]
What does the large difference in explained variance between the two methods tell me about the data, or rather about what I am doing with the data?
Essentially, when I scale the data the variance is no longer concentrated in the first few components, so PCA barely reduces the dimensionality. Does that mean that in my case it is not the best idea to apply standardization?
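My guess is that one or two raw features have a much larger scale than the rest, so the unscaled PCA is dominated by them. This is the kind of check I have in mind (a minimal sketch with synthetic stand-in data, since I can't share my actual X_train; the shapes and the scale factor are placeholders):

import numpy as np
from sklearn.preprocessing import StandardScaler

# Stand-in data: one feature with a much larger scale than the others.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
X_demo[:, 0] *= 100.0  # feature 0 dominates the raw variance

# Per-feature variance of the raw data: feature 0 dwarfs the rest,
# so the first principal component would essentially be that feature.
print(np.round(X_demo.var(axis=0), 2))

# After standardization every feature has unit variance, so no single
# direction dominates and the explained variance spreads out.
X_demo_scaled = StandardScaler().fit_transform(X_demo)
print(np.round(X_demo_scaled.var(axis=0), 2))

If the raw variances of my real features look like that, it would explain why the unscaled cumulative ratio jumps to 0.97 on the first component alone.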