
I've been working on a problem where I need to resample from data given in the form $x_{-dx}^{+Dx}$, where $x$, $x-dx$ and $x+Dx$ are the median, 16th percentile and 84th percentile. Some of these are too skewed to be described by a skew normal, which can only reach $1/1.55 \lesssim dx/Dx \lesssim 1.55$. So I've been wondering: what distributions

  1. have one shape parameter that allows arbitrary skewness and
  2. become the normal distribution when the shape parameter is zero?

Bonus points if it's implemented in scipy.stats. So far the only distribution I've found that satisfies these two conditions is the generalized normal distribution (version 2), which looks something like a shifted, scaled and maybe flipped log-normal.

I've implemented it for myself as a SciPy distribution. I'd open a request to add it to SciPy but I'm not yet sure it's used widely enough to be warranted.
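As a rough illustration of how the version 2 generalized normal can be matched to data of the form $x_{-dx}^{+Dx}$, here is a minimal sketch using only the Python standard library. It is not the implementation mentioned above. It assumes the usual parameterization with location $\xi$, scale $\alpha$ and shape $\kappa$, for which the quantile function is $\xi + \frac{\alpha}{\kappa}\left(1 - e^{-\kappa z}\right)$ with $z = \Phi^{-1}(p)$. The function names `fit_gennorm2`, `gennorm2_quantile` and `gennorm2_sample` are hypothetical.

```python
import math
import random
from statistics import NormalDist

# Standard normal quantile of the 84th percentile (the 16th is its negative).
Z84 = NormalDist().inv_cdf(0.84)


def fit_gennorm2(median, dx, Dx):
    """Return (xi, alpha, kappa) so that the 16th, 50th and 84th percentiles
    are median - dx, median and median + Dx.

    Assumes the parameterization with quantile xi + alpha/kappa * (1 - exp(-kappa*z)).
    Under that assumption dx / Dx = exp(kappa * Z84), which gives kappa directly.
    """
    if dx <= 0 or Dx <= 0:
        raise ValueError("dx and Dx must be positive")
    r = dx / Dx
    kappa = math.log(r) / Z84
    if kappa == 0.0:
        alpha = dx / Z84  # symmetric case: an ordinary normal distribution
    else:
        alpha = kappa * dx / (r - 1.0)
    return median, alpha, kappa


def gennorm2_quantile(p, xi, alpha, kappa):
    """Quantile function under the assumed parameterization."""
    z = NormalDist().inv_cdf(p)
    if kappa == 0.0:
        return xi + alpha * z
    return xi - alpha / kappa * math.expm1(-kappa * z)


def gennorm2_sample(n, xi, alpha, kappa, seed=None):
    """Draw n values by transforming standard normal draws."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        if kappa == 0.0:
            draws.append(xi + alpha * z)
        else:
            draws.append(xi - alpha / kappa * math.expm1(-kappa * z))
    return draws


# Example: a measurement quoted as 10 with dx = 3 and Dx = 1.
xi, alpha, kappa = fit_gennorm2(10.0, 3.0, 1.0)
samples = gennorm2_sample(20000, xi, alpha, kappa, seed=1)
```

Because the ratio $dx/Dx$ maps to $\kappa$ through a logarithm, any positive ratio can be matched, at the cost of support that is bounded on one side.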

I searched on here for skew normal distributions but mostly came across questions about estimating the parameters of the "ordinary" skew normal I mentioned above.

Warrick
  • I don't quite understand what you mean by "too skew to be described by a skew normal". The skew normal allows for arbitrary skewness. Your conditions of a zero mean and a normal distribution are met if you use $\xi=0$, $\omega=1$ (or anything else, it's a free parameter) and let $-\infty – Stephan Kolassa Dec 17 '21 at 13:57
  • Since you write that you have 68% "confidence intervals" (I assume you mean symmetric *quantiles*, which is something different than CIs), maybe you are not really interested in having a distribution with a specified *skewness*, but in a distribution with a specified *mean* (zero) plus two more specified quantiles? – Stephan Kolassa Dec 17 '21 at 13:58
  • Since you also need to worry about central tendency and tail heaviness, 4 parameters may be required (location, scale, tail heaviness, asymmetry) as in the skew-t distribution. But also look up Tukey's generalized lambda distribution. – Frank Harrell Dec 17 '21 at 14:05
  • @StephanKolassa I apologise for not being specific. I have data in a form like *x -dx +Dx*, where *x*, *x-dx* and *x+Dx* are the median, 16th percentile and 84th percentile. If I set the skewness parameter of a skew normal to infinity, I can't get "skewer" than dx/Dx=1.55. Quoting Wikipedia, "Note, however, that the skewness (γ₁) of the distribution is limited to the interval (−1, 1)." – Warrick Dec 17 '21 at 14:51
  • Skewed Lambert W x normal or heavy tail Lambert W x normal with double tails would work for you. – Georg M. Goerg Dec 17 '21 at 17:08
  • Think of the set of all distributions as a space of points. (It is larger than any finite-dimensional space.) The Normal distributions form a curved surface within that space. (It's a surface because Normal distributions are described with two parameters.) Adding a third parameter for skewness is, geometrically, asking to find some 3D manifold containing this surface. It's analogous to finding a curve passing through a given point: there is a myriad variety of possible solutions. This isn't the way to go about solving your problem. Look for distributions appropriate for your data. – whuber Dec 17 '21 at 18:03
  • (Continued). "Appropriate for your data" means that information about what your data mean, how they were generated, how they were collected, and how they were measured often can suggest suitable families of distributions to use for modeling them. What can you tell us about these things? – whuber Dec 17 '21 at 18:04
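The limit of about 1.55 discussed in the comments above can be checked with a short sketch. It relies on the fact that the skew normal tends to a half-normal distribution as its shape parameter goes to infinity, and that the ratio of the two percentile offsets does not depend on location or scale. The helper name `halfnormal_quantile` is hypothetical.

```python
from statistics import NormalDist


def halfnormal_quantile(p):
    """Quantile of |Z| for standard normal Z.

    |Z| has CDF 2*Phi(x) - 1 for x >= 0, so its quantile is Phi^{-1}((1 + p) / 2).
    """
    return NormalDist().inv_cdf((1.0 + p) / 2.0)


# 16th percentile, median and 84th percentile of the limiting half-normal.
low, median, high = (halfnormal_quantile(p) for p in (0.16, 0.50, 0.84))

short_side = median - low   # offset on the bounded side
long_side = high - median   # offset on the long-tailed side
ratio = long_side / short_side

print(f"limiting ratio of percentile offsets: {ratio:.3f}")
```

The printed ratio is close to 1.55, consistent with the bound quoted in the question. Flipping the sign of the shape parameter gives the reciprocal.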

1 Answer


Need more information. The skewness $=\frac{\left(3 e^{k^2}-e^{3 k^2}-2\right) \text{sgn}(k)}{\left(e^{k^2}-1\right)^{3/2}}$ of the Version 2 generalization of the normal distribution is numerically unbounded.

[Image: plot of the skewness]
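For readers without the plot, the skewness expression in this answer can be evaluated directly. The sketch below is only an illustration. It also uses an algebraically equivalent factored form, $-\text{sgn}(k)\left(e^{k^2}+2\right)\sqrt{e^{k^2}-1}$, which follows from $3u - u^3 - 2 = -(u-1)^2(u+2)$ with $u = e^{k^2}$ and makes the unbounded growth easier to see. The function names are hypothetical.

```python
import math


def skewness_direct(k):
    """Skewness as written in the answer; the k = 0 limit (zero) is handled separately."""
    if k == 0:
        return 0.0
    u = math.exp(k * k)
    return math.copysign(1.0, k) * (3.0 * u - u**3 - 2.0) / (u - 1.0) ** 1.5


def skewness_factored(k):
    """Equivalent form -sgn(k) * (e^{k^2} + 2) * sqrt(e^{k^2} - 1)."""
    if k == 0:
        return 0.0
    u_minus_1 = math.expm1(k * k)
    return -math.copysign(1.0, k) * (u_minus_1 + 3.0) * math.sqrt(u_minus_1)


# The magnitude grows without bound as |k| increases.
for k in (0.1, 0.5, 1.0, 2.0, 3.0):
    print(f"k = {k:>4}: skewness = {skewness_factored(k):.4g}")
```

The factored form is the better choice for small $|k|$, where the direct expression subtracts nearly equal numbers.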

I think perhaps that skewness alone may not be a sufficient criterion to allow for the answer sought. Either that, or perhaps this is an answer in that the OP seems to be looking at a more restricted version of skewed normal than the Version 2 generalized normal distribution. There are other possible answers, for example, via reparameterization of the gamma distribution in this answer. That involves the semi-infinite support of a shifted gamma distribution. I am not sure, but part of the problem may be related to the support; that is, it may be difficult to require infinite support AND require infinite skewness.

Carl
  • This plot makes no sense as a depiction of any probability distribution: by having negative values it cannot possibly be a PDF or CDF; and by having absolute values exceeding $1,$ it cannot be a characteristic function. Are you sure you aren't just plotting numerical error? – whuber Dec 17 '21 at 18:42
  • @whuber You are looking at a plot of skewness, and I say so in the answer, that is, if it is an answer. – Carl Dec 17 '21 at 18:44
  • I see, thank you. You don't explicitly say this is a plot of the skewness and in the context it would be natural to interpret that plot (as I did) as somehow depicting the distribution, not its skewness. I don't understand the point of the plot, though, because it's unclear how it is intended to answer the question. – whuber Dec 17 '21 at 18:47
  • @whuber OK, better now? The point is that the OP gave an example of skewness that appears to be range restricted, see [skewness in this](https://en.wikipedia.org/wiki/Skew_normal_distribution), and then mentioned Version 2 Generalized ND, which is not range restricted, but only has semi-infinite support. It appears to be easy to answer the question when semi-infinite support is allowed, but this is not discussed in the question, and needs to be discussed. – Carl Dec 17 '21 at 18:54
  • Might I suggest the references to that distribution and even to skewness are irrelevant to useful answers? They reflect the OP's efforts rather than a suitable approach to such an analysis. IMHO, of far greater importance than attending to skewness would be a consideration of how to model data in the tails, because that extrapolation depends crucially on the model. Skewness is only a crude way to approach that issue. – whuber Dec 17 '21 at 18:57
  • @whuber Of course you can suggest that, as did I: "I think perhaps that skewness alone may not be a sufficient criterion..." However, it does beg the question, and perhaps it would be better to ask the OP exactly why he wants what he wants. For example, why require asymptotic normality? I am also very interested in tests for tail type, but maybe a different question would be better to get there. – Carl Dec 17 '21 at 20:11
  • I apologise that I wasn't clear that the "Version 2" generalised normal distribution is the only distribution I've found so far *that satisfies the two criteria in the question*. I'm curious to know if there are others. – Warrick Dec 18 '21 at 13:50
  • I gave you a link to one. True, you have to reciprocate a parameter to obtain your desired limit of the ND as the parameter goes to zero. – Carl Dec 18 '21 at 14:40