I read a paper regarding air quality. The author mentioned that "The PM10 pollutant also shows the highest of kurtosis value, which indicates that it appears most frequently in unhealthy pollution events."

So I am confused: how can a kurtosis value be used to distinguish healthy from unhealthy pollution events?

Maryam
    High 'kurtosis' measures the fat tails, i.e., outliers. There must be a baseline distribution that represents healthy levels for an aerosol/pollutant. – msuzen Jun 24 '21 at 13:38

1 Answer

It would be good to see more context here, but it seems to me that they are trying to make some sort of argument using the long-debunked, incorrect "more concentration toward the mean" interpretation of greater kurtosis.

Larger kurtosis does not imply greater concentration toward the mean; here is a counterexample showing a family of distributions where kurtosis increases but concentration toward the mean stays constant.

There is only one solid, unassailable implication of higher kurtosis that I am aware of. It is pretty simple, and goes as follows:

Suppose you have two data sets, and the second one has higher kurtosis. What does that tell you about the comparison of the two data sets? Simple. Take the data set with higher kurtosis and standardize its values, $z_i = (x_i - \bar x)/s_x$. Then compute $z_i^4$, $i = 1,\dots ,n$, and draw the dot plot (like a histogram but with dots) of these values. Place a fulcrum on the horizontal axis, located at the kurtosis of the data set that has smaller kurtosis.

Now, since the kurtosis is the average of the $z_i^4$ values, the dot plot balances at the larger kurtosis; hence, it "falls to the right" of the fulcrum located by the smaller kurtosis.
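The balance argument can be checked numerically. The following is a minimal sketch, not anything from the paper in question: the two toy data sets and the helper names `z_fourth` and `kurtosis` are made up for illustration, and it assumes the population standard deviation (`ddof=0`, NumPy's default) so that the average of the $z_i^4$ values is exactly the moment kurtosis. With the sample standard deviation the equality only holds approximately.

```python
import numpy as np

def z_fourth(x):
    """Return z_i^4 for each observation, using the population SD (ddof=0)."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return z ** 4

def kurtosis(x):
    """Moment kurtosis (not excess kurtosis): the average of the z_i^4 values."""
    return z_fourth(x).mean()

# Hypothetical toy data sets; the second one has more extreme tails.
low = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
high = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])

# Put the fulcrum at the smaller kurtosis and look at the z^4 dot plot
# of the data set with the larger kurtosis.
fulcrum = kurtosis(low)
z4_high = z_fourth(high)

# Net torque about the fulcrum: positive means the plot "falls to the right".
net_torque = np.sum(z4_high - fulcrum)

print("kurtosis (low): ", kurtosis(low))
print("kurtosis (high):", kurtosis(high))
print("z^4 values (high):", z4_high)
print("net torque about the fulcrum:", net_torque)
```

In this toy case the only $z_i^4$ values to the right of the fulcrum come from the two most extreme observations; the points near the mean sit close to 0, on the left.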

Now what causes it to "fall to the right"? Is it greater concentration toward the mean of the data with higher kurtosis? No, that would make it "fall to the left," because "concentration toward the mean" corresponds to $z_i$ near 0. Is it greater "peakedness" of the distribution with higher kurtosis? No, again, that would arguably make it "fall to the left."

So, greater kurtosis corresponds precisely to greater tail weight (or more precisely, tail leverage), which in turn is caused by tail extension, or extreme values, as measured by the $z$ values taken to the fourth power.

As Mehmet indicated in the comment, kurtosis is thus a good indicator of outliers.

But be careful here as well: there are sources popping up on the web saying that higher kurtosis indicates "more outliers." This is not true: if you add the same extreme data value repeatedly to a data set, you actually decrease kurtosis. The reason is that outliers are rare, extreme values. If the same data value occurs many times, that observation is no longer as rare or as extreme.

So instead, higher kurtosis more specifically implies greater tail leverage, as described above. A single, more extreme outlier is enough to increase kurtosis.
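A quick numerical check of both points, as a rough sketch under stated assumptions: the base data (101 evenly spaced values on $[-1, 1]$), the outlier values 10 and 20, and the helper names `kurtosis` and `with_outliers` are all hypothetical choices for illustration, and kurtosis is again computed with the population standard deviation.

```python
import numpy as np

def kurtosis(x):
    """Moment kurtosis (not excess): mean of z^4 with the population SD."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return np.mean(z ** 4)

def with_outliers(base, value, copies):
    """Append `copies` identical observations equal to `value` to `base`."""
    return np.concatenate([base, np.full(copies, value)])

# Hypothetical base data: 101 evenly spaced values with no outliers.
base = np.linspace(-1.0, 1.0, 101)

k_base = kurtosis(base)
k_one = kurtosis(with_outliers(base, 10.0, 1))      # one outlier at 10
k_ten = kurtosis(with_outliers(base, 10.0, 10))     # ten copies of 10
k_many = kurtosis(with_outliers(base, 10.0, 100))   # one hundred copies of 10
k_far = kurtosis(with_outliers(base, 20.0, 1))      # one outlier at 20

print("no outlier:         ", k_base)
print("one outlier at 10:  ", k_one)
print("ten copies of 10:   ", k_ten)
print("100 copies of 10:   ", k_many)
print("one outlier at 20:  ", k_far)
```

In this setup a single outlier raises kurtosis sharply, further copies of the same value pull it back down, and moving the single outlier farther out raises it again, which is the tail-leverage reading rather than a count of outliers.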

BigBendRegion