
I am dealing with empirical data, integer- and continuous-valued, bounded below at zero, that are often positively skewed and seem to follow either the Poisson, $\chi^2$, binomial, or beta-binomial distribution.

From my experience with the data, I can tell that stronger positive skewness is associated with a lower mean and a lower variance. Checking the mean, variance, and skewness formulas of the above-mentioned distributions bears this intuition out: a lower mean comes with lower variability and higher positive skewness.
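
Here is a minimal numerical check of that pattern, assuming `scipy` is available; the parameter values below are arbitrary illustrations, not taken from my data:

```python
# Compare mean, variance, and skewness across a few parameter settings of the
# four distributions mentioned above (illustrative parameter values only).
from scipy import stats

cases = {
    "Poisson(mu)":             [stats.poisson(mu) for mu in (0.5, 2, 8)],
    "Chi-squared(df)":         [stats.chi2(df) for df in (1, 4, 16)],
    "Binomial(20, p)":         [stats.binom(20, p) for p in (0.05, 0.2, 0.4)],
    "Beta-binomial(20, a, 5)": [stats.betabinom(20, a, 5) for a in (1, 2, 4)],
}

for name, dists in cases.items():
    print(name)
    for d in dists:
        m, v, s = d.stats(moments="mvs")
        # In every family, as the mean rises the variance rises and the
        # positive skewness shrinks.
        print(f"  mean={float(m):7.3f}  var={float(v):7.3f}  skew={float(s):6.3f}")
```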

I know that in upper-bounded distributions like the (beta-)binomial, if the mean is larger than the midpoint, the skewness becomes negative and the mean-variance relationship is reversed (i.e., the higher the mean, the lower the variance). What I am interested in is the case where the skewness is positive.
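
Writing $X \sim \mathrm{Binomial}(n, p)$ for the simplest case, the standard formulas make this reversal explicit:
$$\mathrm{E}[X] = np, \qquad \operatorname{Var}(X) = np(1-p), \qquad \operatorname{Skew}(X) = \frac{1-2p}{\sqrt{np(1-p)}},$$
so for $p > 1/2$ (mean above the midpoint $n/2$) the skewness is negative and the variance falls as the mean rises, while for $p < 1/2$ the skewness is positive and the mean and variance rise together.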

Now I want to claim that my intuition is generally true. However, looking into some less conventional distributions, I noticed that there are exceptions. For instance, the skewnesses of the Maxwell–Boltzmann and Irwin–Hall distributions are constant and independent of their parameter (and thus unrelated to the mean or variance); see the quick check below. Moreover, there are other distributions whose mean, variance, and skewness are determined by exactly three parameters, and I suspect they would also go against my intuition.
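
Here is that quick check of the two counterexamples, assuming `scipy` and `numpy`; the Irwin–Hall part is simulated as a sum of uniforms rather than taken from a built-in distribution:

```python
# The Maxwell-Boltzmann skewness is the same for every scale parameter, and the
# Irwin-Hall skewness is ~0 for every n, even though mean and variance change.
import numpy as np
from scipy import stats

for scale in (0.5, 1.0, 4.0):
    m, v, s = stats.maxwell(scale=scale).stats(moments="mvs")
    print(f"Maxwell scale={scale}: mean={float(m):.3f}  "
          f"var={float(v):.3f}  skew={float(s):.3f}")

rng = np.random.default_rng(0)
for n in (2, 5, 20):
    x = rng.random((200_000, n)).sum(axis=1)   # Irwin-Hall(n) = sum of n U(0,1)
    print(f"Irwin-Hall n={n}: mean={x.mean():.3f}  var={x.var():.3f}  "
          f"sample skew={stats.skew(x):.3f}")
```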

With all this said, is there a set of conditions (perhaps expressed in terms of moments) under which my intuition will always be correct?

Thanks in advance!

ManuHaq
  • Focusing on commonly studied distributions tends to be counterproductive, because it confuses mathematical tractability and familiarity with the underlying concepts or phenomena. I suspect that the phenomenon you are trying to get at is explained fairly generally in my answer at https://stats.stackexchange.com/a/66038/919. It certainly provides one kind of answer to your question, depending on what you are actually referring to by your "intuition." It covers both positively skewed and negatively skewed situations, btw. – whuber Feb 06 '22 at 16:23
  • Thanks, @whuber. Your first point is fair. And your other answer was such a great explainer! It answers the level(mean)-to-spread(variance) side of my question, though I cannot see how it may address the spread(variance)-to-asymmetry(skewness) [or level(mean)-to-asymmetry(skewness)] part of it. (Also, what I meant by intuition was merely the pattern I could induce from the time series data that I'm working with, which was also in line with a visual comparison of the curves of the above-mentioned distributions for different parameter sets.) – ManuHaq Feb 07 '22 at 10:08
  • It addresses skewness insofar as applying any Box-Cox transformation (apart from the identity) changes skewness. One can go further by studying the spreads vs. the levels of the N-letter summary of univariate data to estimate a Box-Cox parameter that will minimize skewness. – whuber Feb 07 '22 at 14:47
  • Sure, the transformation affects skewness, though how can we tell the monotonicity/direction between skewness and level based on that? I looked up the N-letter summary approach based on [this](https://mgimond.github.io/ES218/Week08b.html) and [your answer here](https://stats.stackexchange.com/a/96684/142674). From what I gather, it can be used to visually examine skewness, though the number of letter-value summaries cannot be determined without a (dataset-specific) stopping rule. (Am I getting it correctly?) How do we account for that when following the analysis in your other answer? – ManuHaq Feb 08 '22 at 06:45
  • Your perceptions are incorrect: there are *quantitative* ways to examine an N-letter summary to assess skewness. (I was alluding to one in my previous comment: you plot squares of differences between matched summaries--high and low--against their midranges.) For details, see Tukey's book *EDA.* [A rough sketch of this follows the comments.] – whuber Feb 08 '22 at 15:17
  • Hmmm, I see. I am going to read it in the book then. Thanks :). – ManuHaq Feb 09 '22 at 08:09
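
Following up on the letter-value (N-letter summary) approach discussed in the comments, here is a rough sketch of the mechanics, assuming only `numpy`; the pairing of squared high-low differences with midranges follows the comment above, but the exact construction (and how a fitted slope maps to a Box-Cox power) is in Tukey's *EDA*, so treat this as an illustrative approximation rather than the canonical recipe. The gamma sample merely stands in for a positively skewed batch:

```python
# Rough sketch of a letter-value (N-letter summary) skewness check.
import numpy as np

def letter_values(x, depth=5):
    """Matched lower/upper letter values (fourths, eighths, ...) of a sample."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    lowers, uppers = [], []
    d = (n + 1) / 2.0                      # depth of the median
    for _ in range(depth):
        d = (np.floor(d) + 1) / 2.0        # halve the depth at each new letter
        lowers.append(np.interp(d - 1, np.arange(n), x))   # lower letter value
        uppers.append(np.interp(n - d, np.arange(n), x))   # upper letter value
    return np.array(lowers), np.array(uppers)

rng = np.random.default_rng(1)
sample = rng.gamma(shape=2.0, scale=3.0, size=5000)   # a positively skewed batch

lo, hi = letter_values(sample)
mids = (lo + hi) / 2.0          # midranges of matched summaries
sq_spreads = (hi - lo) ** 2     # squared high-low differences

# For a symmetric batch the mids stay near the median however large the spread
# gets; a systematic upward drift of the mids as the squared spreads grow is
# the classic letter-value sign of positive skewness.
drift = np.polyfit(sq_spreads, mids, 1)[0]
print("mids:           ", np.round(mids, 2))
print("squared spreads:", np.round(sq_spreads, 2))
print("drift of mids vs squared spreads:", round(float(drift), 4))
```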
