I have a set of features with the following covariance matrix:
           feature_1  feature_2  feature_3
feature_1       3347        -57        -17
feature_2        -57          2        0.4
feature_3        -17        0.4         53
Each feature is on a different scale, which is why the magnitudes of the variances differ so much.
Now, the total "variability" (the sum of the variances on the diagonal) is 3347 + 2 + 53 = 3402.
According to this post, can I say that "feature_1 explains 98% (3347 / 3402) of the total variability"?
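For reference, the arithmetic can be reproduced with a short sketch. It assumes NumPy and retypes the covariance matrix by hand; the variable names are only illustrative.

```python
import numpy as np

# Covariance matrix from the question, rows/columns ordered as
# feature_1, feature_2, feature_3.
cov = np.array([
    [3347.0, -57.0, -17.0],
    [-57.0, 2.0, 0.4],
    [-17.0, 0.4, 53.0],
])

variances = np.diag(cov)          # per-feature variances (the diagonal)
total_variance = np.trace(cov)    # sum of the diagonal
shares = variances / total_variance

for name, share in zip(["feature_1", "feature_2", "feature_3"], shares):
    print(f"{name}: {share:.1%} of the total variance")
```

The share for feature_1 comes out at roughly 98%, which is the figure quoted above.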
I think this would be unfair, because that feature has such a large variance only because of its scale.
So my question is: would it make more sense to scale the data first (e.g. with the MinMaxScaler) before calculating the covariance matrix?
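To make the scaling idea concrete, here is a minimal sketch of how rescaling changes the covariance matrix. Multiplying feature i by a constant a_i turns the covariance C into D C D, with D = diag(a). The sketch uses standardization (a_i = 1 / standard deviation) rather than MinMaxScaler, because the min-max factor 1 / (max - min) cannot be recovered from the covariance matrix alone. The helper `rescale_covariance` is a hypothetical name, not a library function.

```python
import numpy as np

# Covariance matrix from the question.
cov = np.array([
    [3347.0, -57.0, -17.0],
    [-57.0, 2.0, 0.4],
    [-17.0, 0.4, 53.0],
])

def rescale_covariance(cov, factors):
    """Covariance of the features after multiplying feature i by factors[i]."""
    d = np.diag(factors)
    return d @ cov @ d

# Standardization: divide each feature by its standard deviation.
std = np.sqrt(np.diag(cov))
corr = rescale_covariance(cov, 1.0 / std)  # this is the correlation matrix

# After standardization every variance is 1, so each feature's share
# of the total variance is the same.
shares = np.diag(corr) / np.trace(corr)
print(corr)
print(shares)
```

After standardization the diagonal is all ones, so every feature accounts for an equal share of the total variance. With min-max scaling the shares would instead depend on each feature's range.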