Long story short, I have a collection of about 30 scripts that manipulate data sets and place them in a database. These scripts report their running times, as well as any errors that occur, to a separate database. I wrote another script that goes through this database daily and, for each script, determines whether an error occurred. It also pulls each script's running times from the past 30 days and averages them.
I take the running time of the current script and check whether it is more than 3 standard deviations above the average. If it is, I report that the running time is too far from the average.
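For clarity, here is roughly what the check does (a minimal Python sketch; the function and variable names are mine, not copied from my actual scripts):

```python
import statistics

def check_runtime(current_runtime, past_runtimes):
    """Flag the current run if it is more than 3 standard deviations
    above the mean of the past runs (here, the last 30 days)."""
    mean = statistics.mean(past_runtimes)
    stdev = statistics.stdev(past_runtimes)  # sample standard deviation
    if current_runtime > mean + 3 * stdev:
        return "running time too far from average"
    return None
```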
Is this the correct method for performing such a task? I feel as if I get far too many "running time too far from average" errors. Would increasing the sample size help, or does the 3-standard-deviation rule simply not apply here? I was under the assumption that about 99% of data lies within 3 standard deviations of the mean, so a reliable way to detect outliers (a script that took a "hella long time" to run) would be to use this method.