Can Welch's t-test be used with the median and the median absolute deviation (MAD) in place of the mean and variance? I suspect outliers are causing problems, and the median gives less weight to extreme values.
I'm using this to test whether one piece of code is faster than another. I think some of the outliers don't represent a true measurement but rather some kind of pause in the process.
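
To make the concern concrete, here is a minimal sketch in Python. The timings are made up for illustration, not real measurements, and the names `welch_t`, `mad`, `a`, `b` and `b_paused` are hypothetical. It computes the ordinary Welch t statistic from the mean and variance, and prints the median and MAD alongside it, to show how a single paused run affects each. It does not attempt a median/MAD version of the test, since whether that is valid is exactly the question.

```python
import math
import statistics


def welch_t(x, y):
    """Standard Welch t statistic, built from sample means and variances."""
    standard_error = math.sqrt(
        statistics.variance(x) / len(x) + statistics.variance(y) / len(y)
    )
    return (statistics.mean(x) - statistics.mean(y)) / standard_error


def mad(samples):
    """Median absolute deviation: the median of |x - median(x)|."""
    med = statistics.median(samples)
    return statistics.median([abs(x - med) for x in samples])


# Hypothetical run times in milliseconds (assumed values, not measured).
a = [10.1, 10.3, 9.9, 10.2, 10.0, 10.4, 9.8]   # code A
b = [9.5, 9.7, 9.3, 9.6, 9.4, 9.8, 9.2]        # code B, consistently faster
b_paused = b + [55.0]                          # code B plus one run hit by a pause

for label, sample in [("a", a), ("b", b), ("b_paused", b_paused)]:
    print(
        f"{label:9s}"
        f" mean={statistics.mean(sample):8.3f}"
        f" var={statistics.variance(sample):9.3f}"
        f" median={statistics.median(sample):7.3f}"
        f" MAD={mad(sample):6.3f}"
    )

print("Welch t, a vs b:       ", welch_t(a, b))
print("Welch t, a vs b_paused:", welch_t(a, b_paused))
```

In this toy data the one paused run inflates the mean and variance of `b_paused` enough that the Welch t statistic changes sign, while the median and MAD barely move.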