I'm modelling data from a behavioural task. Participants do a few hundred trials. On each trial, they see a sequence of letters at a point on the screen, and one of these letters appears surrounded by a white circle. Their task is to report the letter within the circle. Because there are no repeats in the sequences they see, any response can be mapped onto a point in time on the trial relative to the white circle. A response of the letter in the circle has a value of 0; the letter before has a value of -1; the letter after has a value of 1, and so on.

We've been modelling the distribution of these temporal errors with a mixture of a uniform distribution and some other, non-uniform distribution. Up until now the non-uniform distribution has been Gaussian, but certain theoretical considerations suggest that we need, instead of the Gaussian, a positively skewed component whose domain is bounded below at zero. I considered using the lognormal distribution, but this is a bad choice because it is undefined at zero.

What positively skewed distributions can model values of zero and greater?

I'm using Matlab. Something that has a PDF written in that language would be great (I'm a scientist, not a statistician).

ivan.k
  • Truncating nearly any distribution below at $0$ is going to increase its skewness. That gives you a huge array of possible solutions (and every solution can be expressed in such a form). To keep this thread from being overly (and uselessly) broad, could you please be more specific about what you're modeling and what you're trying to accomplish? – whuber Jul 31 '18 at 11:41
  • Thanks @whuber. I've edited the post with more detail – ivan.k Aug 01 '18 at 01:14

1 Answer

You could take any distribution defined on $(0,\infty)$ and simply shift it to the left by a small $\epsilon$. Then you would get negative values in $(-\epsilon,0)$, which I guess is not what you are looking for.

Alternatively, your skewed component might derive from a two-stage data-generating process that produces either a zero or a nonzero value. If this sounds like something that might be present in your application, you might want to look into zero-inflated models, i.e., mixtures between a point mass at zero and a second distribution, which in turn could be supported on $(0,\infty)$ or $[0,\infty)$.
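
To make the two-stage idea concrete, here is one way to write the resulting distribution function. The symbols are notation introduced here, not from the question: $\pi$ is the probability that the first stage produces a zero, and $G$ is the CDF of the second distribution (with $G(x)=0$ for $x<0$).

$$F(x) = \pi\,\mathbf{1}\{x \ge 0\} + (1-\pi)\,G(x)$$

So $F$ jumps by at least $\pi$ at zero and follows a rescaled copy of $G$ afterwards.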

Zero-inflated models are more common with count data, but there are applications for zero-inflated gamma distributions. And the gamma distribution has the advantage of being positively skewed. The skewness of a mixture is not overly hard to calculate.
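
As a rough illustration, here is a minimal sketch of a zero-inflated gamma in Python, using only the standard library so that it should be easy to port to Matlab. The function names and the parameterisation (`p0` for the probability of an exact zero, `shape` and `scale` for the gamma part) are my own choices for this example, not an established API. The skewness comes from the raw moments of the mixture: the point mass at zero contributes nothing to them, so each raw moment is just $(1 - p_0)$ times the corresponding gamma moment.

```python
import math

def gamma_logpdf(x, shape, scale):
    """Log density of a gamma distribution at x > 0 (shape/scale parameterisation)."""
    return ((shape - 1.0) * math.log(x) - x / scale
            - math.lgamma(shape) - shape * math.log(scale))

def zig_loglik(data, p0, shape, scale):
    """Log-likelihood of a zero-inflated gamma (hypothetical helper).

    Exact zeros have probability p0; positive values have density
    (1 - p0) * Gamma(shape, scale). Requires 0 < p0 < 1 and data >= 0.
    """
    total = 0.0
    for x in data:
        if x == 0:
            total += math.log(p0)
        else:
            total += math.log(1.0 - p0) + gamma_logpdf(x, shape, scale)
    return total

def zig_skewness(p0, shape, scale):
    """Skewness of the zero-inflated gamma, computed from raw moments."""
    w = 1.0 - p0
    # Raw gamma moments: E[X^n] = scale^n * shape * (shape + 1) * ... * (shape + n - 1)
    m1 = w * scale * shape
    m2 = w * scale**2 * shape * (shape + 1.0)
    m3 = w * scale**3 * shape * (shape + 1.0) * (shape + 2.0)
    var = m2 - m1**2
    third_central = m3 - 3.0 * m1 * m2 + 2.0 * m1**3
    return third_central / var**1.5
```

With `p0 = 0` the skewness reduces to the plain gamma value $2/\sqrt{\text{shape}}$, which is a handy sanity check. To fit the model you would pass the negative of `zig_loglik` to a numerical optimiser. In Matlab, I believe `gampdf(x, a, b)` from the Statistics and Machine Learning Toolbox uses the same shape/scale convention, but please check its documentation.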

Stephan Kolassa