
I think most people leave high school with a very limited understanding of basic statistical concepts. Part of this is probably due to the intimidating-looking formulas for things like standard error (all those sigmas, indices, etc.).

I've always thought it would make much more sense to teach mean absolute deviation when introducing students to quantifying dispersion. It turns out I'm not alone; this education researcher seems to feel the same way:

http://sru.soc.surrey.ac.uk/SRU64.pdf http://dro.dur.ac.uk/12187/
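For concreteness, the mean absolute deviation is just the average distance from the mean, which a student can compute step by step. A minimal sketch in Python (the example scores are made up for illustration):

```python
import statistics

def mean_absolute_deviation(values):
    """Average absolute distance of each value from the mean."""
    m = statistics.mean(values)
    return sum(abs(x - m) for x in values) / len(values)

scores = [4, 7, 7, 8, 9, 10, 11]  # hypothetical class scores
print(mean_absolute_deviation(scores))  # 1.714...
print(statistics.stdev(scores))         # 2.309... (sample SD, for comparison)
```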

This is easy enough, but it leads to the next obvious question: when an elementary school student is running a simple experiment and comparing means between groups, is there a simpler way for the student to quantify how (un)likely a random sample would be to show the difference they observed? The p-value would be too complicated to teach the concepts behind, and computationally opaque. I'd like to use Cohen's d, but I can't point to a standard set of effect sizes when using mean absolute deviation that would let the student say their differences are "small" or "large".

The idea is that the differences between the student's groups may appear small or large, but that impression should be reinforced with a numerical procedure. Maybe something like comparing the difference between groups to the largest difference within a group? Does anyone have any good ideas?
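One candidate numerical procedure, along the lines sketched above, is a Cohen's-d-style ratio that divides the difference in means by a pooled mean absolute deviation instead of a pooled standard deviation. This is a hypothetical construction for illustration, not an established statistic with calibrated "small"/"large" thresholds:

```python
def mad(values):
    """Mean absolute deviation from the mean."""
    m = sum(values) / len(values)
    return sum(abs(x - m) for x in values) / len(values)

def mad_effect_size(group_a, group_b):
    """Difference in means divided by the average of the two groups' MADs."""
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    pooled_mad = (mad(group_a) + mad(group_b)) / 2
    return (mean_a - mean_b) / pooled_mad

# Made-up plant heights (cm) from a hypothetical classroom experiment
with_fertilizer = [10, 12, 14]
without_fertilizer = [7, 9, 11]
print(mad_effect_size(with_fertilizer, without_fertilizer))  # 2.25
```

Every step here is a mean, an absolute value, or a division, so it could be done in a spreadsheet; whether the result can be mapped onto the conventional 0.2/0.8 scale is exactly the open question.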

gung - Reinstate Monica
  • This is a very broad and subjective question, not so easy for the question-and-answer format here. Anyway, what do you think about teaching Bayes' theorem? Or teaching simple laws of probability (not the reverse stuff), like estimating probabilities in card games? That is how it started 200-300 years ago. The binomial distribution and John Arbuthnot's use of it to express p-values might be a simple starting point. See this question for links on the history and inspiration of "simple" (original/key/fundamental) problems: https://stats.stackexchange.com/questions/343268/who-first-used-invented-p-values – Sextus Empiricus Apr 01 '19 at 23:41
  • I think it is fairly specific, let me try again: I think the mean absolute deviation is a good way to introduce quantitation of dispersion vs. variance. Is there an established way to use mean absolute deviation with other appropriate approximations/heuristics to approximate the p-value (or something like it) in a way that is not opaque in the context of an elementary school student doing an experiment? – mittimithai Apr 01 '19 at 23:57
  • Is that "Is there an established way to use mean absolute deviation with other appropriate approximations/heuristics to approximate the p-value" your specific question? – Sextus Empiricus Apr 01 '19 at 23:58
  • Monte Carlo simulations can approximate a p-value. But is it an answer? (that is what I mean by broad and subjective) – Sextus Empiricus Apr 02 '19 at 00:35
  • Yes, that is the specific question. Monte Carlo simulation isn't a bad idea, but perhaps the implementation would be opaque and require an understanding of distributions. I suppose that really just leaves the question to be refined to: can Cohen's d be used with mean absolute deviation instead of standard deviation, and, if so, is there a way to legitimately compare one's effect sizes to the standard scale of effect sizes (0.2 = small, 0.8 = large)? – mittimithai Apr 02 '19 at 00:47
  • Is the question whether it *can* be done or whether it *is* simpler to kids? Of course, it can be done. – Sextus Empiricus Apr 02 '19 at 01:01
  • I am certain that something like that (which could be done in a spreadsheet) would be simpler for an elementary school child who understands the notion of 'mean'. The question is: is it valid? It is certainly intuitive (it's still just dividing the difference in means by some combination of the dispersions), but it isn't clear to me how the student can legitimately say "I found a d of 0.23, which is considered a small effect in statistics" this way if they are not using standard error for their measure of dispersion. – mittimithai Apr 02 '19 at 01:09
  • It is valid. When the distribution is a normal distribution, the mean absolute deviation (scaled by $\sqrt{\pi/2}$) is an estimator for $\sigma$. (This alternative Cohen's d will not follow a t-distribution, but instead something more exotic; for large sample sizes it should be well approximated by a z-distribution.) Which is better to use is a hot debate. For a start, you can look at ([Fisher 1920, A mathematical examination of the methods of determining accuracy of observation by the mean error and by the mean square error](http://adsabs.harvard.edu/abs/1920MNRAS..80..758F)) – Sextus Empiricus Apr 02 '19 at 01:33
  • Ok. Is there a correspondingly simpler estimator for the pooled variance obtained from the mean absolute deviation? If n's are equal, can one use the mean of the two variances? – mittimithai Apr 02 '19 at 02:52
  • Let us [continue this discussion in chat](https://chat.stackexchange.com/rooms/91846/discussion-between-mittimithai-and-martijn-weterings). – mittimithai Apr 02 '19 at 04:40
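The $\sqrt{\pi/2}$ scaling mentioned in the comments can be checked numerically: for normally distributed data, the mean absolute deviation times $\sqrt{\pi/2}$ recovers $\sigma$. A quick simulation sketch (the sample size and true $\sigma$ are arbitrary choices):

```python
import math
import random

random.seed(0)
true_sigma = 2.0
sample = [random.gauss(0.0, true_sigma) for _ in range(100_000)]

# Mean absolute deviation of the sample, scaled by sqrt(pi/2)
m = sum(sample) / len(sample)
mad = sum(abs(x - m) for x in sample) / len(sample)
sigma_hat = mad * math.sqrt(math.pi / 2)
print(sigma_hat)  # close to 2.0
```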

1 Answer


If you have a whole class of elementary school students, you could introduce the concept of quantifying the randomness of experiments on a more practical level. Say there are 25 students, and one of them has a lopsided coin while the other 24 have fair coins. Everyone flips their coin $n=$ 5 (10, 15, 20, ...) times and writes down the number of heads. Collecting the results of the 25 students gives you a crude histogram of the number of heads, where the one result of the lopsided coin should be among the outlying values. With low $n$ there is actually a chance that some of the fair coins show the same or even more extreme numbers of heads, but this chance diminishes with increasing $n$.

Of course, this way you would have introduced the $p$-value empirically.
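One possible sketch of this classroom experiment in Python, assuming (arbitrarily) a lopsided coin with an 80% chance of heads, 24 fair coins, and 20 flips each:

```python
import random

def classroom_round(n_flips=20, n_fair=24, p_biased=0.8, rng=random):
    """One round: each student flips their coin and records the head count."""
    fair_counts = [sum(rng.random() < 0.5 for _ in range(n_flips))
                   for _ in range(n_fair)]
    biased_count = sum(rng.random() < p_biased for _ in range(n_flips))
    return fair_counts, biased_count

rng = random.Random(42)
fair_counts, biased_count = classroom_round(rng=rng)
# Empirical p-value flavor: how many fair coins did as well or better?
as_extreme = sum(1 for h in fair_counts if h >= biased_count)
print(f"lopsided coin: {biased_count} heads; "
      f"{as_extreme} of {len(fair_counts)} fair coins matched or beat it")
```

Rerunning with smaller `n_flips` makes it more common for a fair coin to match or beat the lopsided one, which is exactly the point about low $n$ above.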

Edgar