General practice for clustering is to apply some sort of linear or non-linear dimensionality reduction before clustering, especially if the number of features (say $n$) is high. In the case of a linear dimensionality reduction technique like PCA, the objective is to find orthogonal principal components (say $m$ of them) that explain most of the variance in the data, such that $m \ll n$ when $n$ is high.
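A minimal sketch of the reduce-then-cluster pipeline described above, in plain Python with no external libraries. Everything specific here is an assumption made for illustration: the toy data (two modes in $n = 5$ features), the choice $m = 1$, the power-iteration approximation of the first principal component, and the helper names `top_component`, `project`, and `two_means_1d`. In practice a library implementation of PCA and k-means would normally be used instead.

```python
import random

def top_component(X, iters=200):
    """Approximate the first principal direction of X by power iteration."""
    n_rows, n_cols = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n_rows for j in range(n_cols)]
    centered = [[row[j] - mean[j] for j in range(n_cols)] for row in X]
    # Sample covariance matrix of the features (n_cols x n_cols).
    cov = [[sum(r[a] * r[b] for r in centered) / (n_rows - 1)
            for b in range(n_cols)] for a in range(n_cols)]
    v = [1.0] * n_cols
    for _ in range(iters):
        # Multiply by the covariance matrix and renormalise to unit length.
        w = [sum(cov[a][b] * v[b] for b in range(n_cols)) for a in range(n_cols)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return mean, v

def project(X, mean, v):
    """Reduce each n-dimensional row to one coordinate along direction v."""
    return [sum((row[j] - mean[j]) * v[j] for j in range(len(v))) for row in X]

def two_means_1d(z, iters=20):
    """Plain k-means with k = 2 on one-dimensional data."""
    c0, c1 = min(z), max(z)  # initialise the centres at the extremes
    for _ in range(iters):
        labels = [0 if abs(x - c0) <= abs(x - c1) else 1 for x in z]
        g0 = [x for x, lab in zip(z, labels) if lab == 0]
        g1 = [x for x, lab in zip(z, labels) if lab == 1]
        c0, c1 = sum(g0) / len(g0), sum(g1) / len(g1)
    return labels

# Toy data (assumed): two "modes of operation" in n = 5 features, 20 samples each.
rng = random.Random(0)
n_features = 5
mode_a = [[rng.gauss(0.0, 0.5) for _ in range(n_features)] for _ in range(20)]
mode_b = [[rng.gauss(5.0, 0.5) for _ in range(n_features)] for _ in range(20)]
X = mode_a + mode_b

mean, v = top_component(X)   # step 1: linear reduction from n = 5 to m = 1
z = project(X, mean, v)      # reduced coordinates, one continuous value per sample
labels = two_means_1d(z)     # step 2: clustering, done separately on the reduced data
print(labels)
```

Note that the reduced coordinate `z` is a continuous value per sample, and the cluster labels only appear after the separate clustering step is run on it.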
But for non-linear dimensionality reduction techniques like autoencoders, can the reduced dimensions themselves be the clusters that indicate different modes of operation, for example of industrial components? Am I missing something here, or is my understanding of non-linear dimensionality reduction wrong? Any help is appreciated.
This question might be too basic for some, so please don't be extremely critical of the question if you don't want to answer it.
@fk128 shared his interpretation of my question, which may be clearer and easier to follow than what I have written above.