
I have a dataset where the summary (the five-number summary plus the mean) is:

      Min.    1st Qu.     Median       Mean    3rd Qu.       Max. 
-0.0082630  0.0000000  0.0000000  0.0000059  0.0000000  0.7966557 

I'm not quite sure how to visualize this, or whether that is even possible! The summary says that the IQR is 0, yet the mean is not 0?

Lucas Farias
shoestringfries
    Suppose you have 100 numbers, 99 of them are zeros and one of them is 100. Then your six statistics would be: 0 0 0 1 0 100, following the order in which you presented yours. – user158565 May 27 '17 at 18:34
    This question is clear enough, I do not understand why somebody wants to close it. – kjetil b halvorsen May 27 '17 at 18:50

4 Answers


The IQR is a measure of variability and the mean is a measure of central tendency. Neither affects the other: you can have distributions centered anywhere with the same variability, or centered in the same place with different variabilities. Having an IQR of 0 means there is no variability in the middle 50% of your data, but the center of the distribution can be anywhere.

If you're confused about why the IQR does not contain the mean, remember that the mean depends on the values of all the measurements in your sample, whereas the IQR depends only on the values of the 25th and 75th percentiles. So, something outside the middle 50% of your data can affect the mean and not the IQR. In this case, values greater than the 75th percentile are dragging your mean up to a value greater than the 75th percentile.
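A minimal sketch in R of this point, using made-up data (the vector `x` is purely illustrative and is not the asker's dataset): one large value moves the mean, while the 25th and 75th percentiles, and so the IQR, stay at 0.

```r
# Hypothetical data: 99 zeros plus one large value (not the asker's data).
x <- c(rep(0, 99), 100)

q      <- quantile(x, c(0.25, 0.75))  # 25th and 75th percentiles, both 0
iqr_x  <- IQR(x)                      # Q3 - Q1 = 0
mean_x <- mean(x)                     # 100 / 100 = 1, above the 75th percentile

print(q)
print(iqr_x)
print(mean_x)
```

The mean uses all 100 values, so the single 100 pulls it up to 1. The quartiles only look at the values a quarter and three quarters of the way through the sorted data, which are all zeros here.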

Noah

The IQR is a measure of spread and is not related to the mean. You can have any mean with any IQR. It's not clear what relationship you expect there to be.

To see this, start by making a data set with the IQR you want; for example you can get an IQR of 0 by making a lot of values (just over half will do) in the middle of your data set the same, like so:

  1  2  3  5  5  5  5  5  5  5  7  8  9

(13 values, 7 of them equal and in the middle; in this case the mean is 5)

Now add any number you choose to all the values. Say 98. Now the numbers are

 99 100 101 103 103 103 103 103 103 103 105 106 107

The IQR is still 0 but the mean is now 103 (the 5 we started with + the 98 we added).
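The shift can be checked with a short R sketch. It uses the 13 values from the example; the variable names `x` and `shifted` are only illustrative, and `IQR()` is called with R's default quantile definition.

```r
# The 13 values from the example: 7 of them equal and in the middle.
x <- c(1, 2, 3, 5, 5, 5, 5, 5, 5, 5, 7, 8, 9)

# Add the same constant to every value.
shifted <- x + 98

# Mean and IQR before the shift: mean 5, IQR 0.
print(c(mean = mean(x), IQR = IQR(x)))

# Mean and IQR after the shift: mean 103, IQR still 0.
print(c(mean = mean(shifted), IQR = IQR(shifted)))
```

Adding a constant moves every quantile by the same amount, so the difference between the 75th and 25th percentiles does not change, while the mean moves by exactly that constant.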

Glen_b

Your mean does not have to be inside the middle quartiles, and you have a lot of zeroes. It does imply that more disclosure is required.

Dave Harris

Try this in R:

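# One negative value, 100 zeros, and 25 copies of the maximum.
# Named x rather than list, to avoid masking the base function list().
x <- c(-0.0082630, rep(0, 100), rep(0.7966557, 25))
summary(x)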
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
## -0.008263  0.000000  0.000000  0.158000  0.000000  0.796700

It gives you just about the same results, with an even larger mean. I hope this answers your question.

Heteroskedastic Jim