I would like to use PCA as a method of anomaly detection, but I'm not sure exactly how this is done (I'm using prcomp in R).
I'm really questioning the approach, not the R code itself. Am I right in thinking that I first run PCA on a set of training data to find a lower-dimensional subspace spanned by the first $k$ PCs? Then, as new data becomes available, I project each new sample onto those $k$ PCs, reconstruct it from that projection, and examine the reconstruction error. If the error is much larger than it is for the training data, the new sample doesn't have the same 'structure' as the data used to build the PCs, and is therefore different in some way, i.e. an anomaly.
Can someone tell me if I'm in the right ballpark with my assumption?
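
To make the question concrete, here is a minimal sketch in R of the procedure as I understand it. Everything in it is an assumption for illustration: the simulated data, the choice $k = 2$, the helper name `recon_error`, and the 99% quantile used as a cut-off are all made up, and I centre but do not scale the variables.

```r
set.seed(1)

# Hypothetical training data: 200 observations of 5 variables that lie
# close to a known 2-dimensional subspace, plus a little noise.
n <- 200
loadings <- rbind(c(1, 0, 1, 0, 1),
                  c(0, 1, 0, 1, 0))
latent <- matrix(rnorm(n * 2), ncol = 2)
train <- latent %*% loadings + matrix(rnorm(n * 5, sd = 0.1), ncol = 5)

# Fit PCA on the training data only and keep the first k PCs.
k <- 2                                   # assumed; would need choosing in practice
pca <- prcomp(train, center = TRUE, scale. = FALSE)
V <- pca$rotation[, 1:k, drop = FALSE]   # p x k matrix of loadings

# Squared reconstruction error for each row of x (hypothetical helper).
recon_error <- function(x, pca, V) {
  xc <- sweep(x, 2, pca$center)          # centre with the TRAINING means
  xhat <- xc %*% V %*% t(V)              # project onto the k PCs and map back
  rowSums((xc - xhat)^2)
}

# Use the training errors to set an (arbitrary) cut-off.
train_err <- recon_error(train, pca, V)
threshold <- quantile(train_err, 0.99)

# Two new samples: one inside the subspace, one orthogonal to it.
new_ok  <- matrix(c(1, -1), nrow = 1) %*% loadings
new_bad <- matrix(c(3, 0, -3, 0, 0), nrow = 1)
new_err <- recon_error(rbind(new_ok, new_bad), pca, V)

is_anomaly <- new_err > threshold
print(is_anomaly)
```

The key points in the sketch are that the new samples are centred with the training means rather than their own, and that the cut-off comes from the distribution of reconstruction errors on the training data. Whether a fixed quantile is a sensible threshold is part of what I'm unsure about.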