I have a matrix where the rows are the data points (samples) and the columns are the features (predictors). Let's say I have 1000 data points and 20 features, i.e. the matrix is of size 1000 x 20.
Now I want to detect and possibly remove outliers. I have read a good introduction: http://www.itl.nist.gov/div898/handbook/eda/section3/eda35h.htm
One possibility is, for example, to use the modified Z-score and remove everything with an absolute value above 3.5.
First, how should I apply this? Should I calculate the modified Z-score for each row (data point) of the matrix and remove the rows that are flagged as outliers, or should I calculate it for each column (feature)? I have the same problem when making plots (e.g. histograms)...
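To make the two options concrete, here is a minimal sketch of what I mean, assuming NumPy and the modified Z-score as defined in the NIST handbook (0.6745 * (x - median) / MAD). The function name `modified_z_scores`, the synthetic data, and the planted outlier are only my illustration, not my real data, and I am not sure which of the two variants is appropriate:

```python
import numpy as np

def modified_z_scores(X, axis=0):
    """Modified Z-score 0.6745 * (x - median) / MAD, computed along `axis`."""
    X = np.asarray(X, dtype=float)
    med = np.median(X, axis=axis, keepdims=True)
    mad = np.median(np.abs(X - med), axis=axis, keepdims=True)
    # Caveat: a MAD of zero (e.g. a constant feature) gives inf/nan here.
    return 0.6745 * (X - med) / mad

# Synthetic stand-in for my 1000 x 20 matrix (assumption, not real data).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))
X[5, 3] = 50.0  # plant one obvious outlier in feature 3 of sample 5

# Option A: per column, so each feature gets its own median and MAD.
M_col = modified_z_scores(X, axis=0)

# Option B: per row, so each sample is scored against its own 20 values.
M_row = modified_z_scores(X, axis=1)

# With option A, drop every sample that has at least one flagged feature.
flag_col = np.abs(M_col) > 3.5
rows_to_drop = flag_col.any(axis=1)
X_clean = X[~rows_to_drop]
```

With option A, a sample is removed as soon as a single one of its 20 features is flagged, which is one of the things I am unsure about.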
Second, which outlier detection method is best (possibly also for data that is not normally distributed)? There are so many. Simple methods like the modified Z-score or just looking at the standard deviation seem to be used often.