I frequently encounter variables whose values are restricted to a known interval but which otherwise look approximately normally distributed. A typical example is the time spent browsing the internet each day. I am aware of several bounded alternatives to the normal distribution that I might use, say Beta or Johnson's $S_B$, and they all come in families. But are there any rigorous criteria for deciding when to use which of them?

1 Answer

I'm not familiar with any rigorous criteria, especially because in my experience people violate the assumptions of their models all the time.

One criterion I think is important is where on the defined interval most of your data lie. Take time per day, bounded to [0, 24]. If your average were, say, 8 hours with an SD of 1, you would have very little data near either end of the interval, and therefore a normal distribution may be appropriate.
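As a rough way to check this, one can compute how much probability a fitted normal would place outside the allowed interval. The sketch below is only an illustration: the helper name `mass_outside` is made up for this example, and it assumes the data really are summarised well by a mean and an SD.

```python
from statistics import NormalDist

def mass_outside(mean, sd, lower=0.0, upper=24.0):
    """Probability that Normal(mean, sd) falls outside [lower, upper]."""
    dist = NormalDist(mu=mean, sigma=sd)
    below = dist.cdf(lower)        # mass below the lower bound
    above = 1.0 - dist.cdf(upper)  # mass above the upper bound
    return below + above

# Example from the text: mean of 8 hours, SD of 1, bounds [0, 24]
print(f"mass outside [0, 24]: {mass_outside(8.0, 1.0):.3g}")
```

If the result is negligible, the bounds hardly matter and the normal approximation loses little. The same function can be called with other means and SDs to see when the boundary starts to bite.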

However, if the mean is more like 2 or 3 with an SD of 1 (probably a more realistic case, and probably also right-skewed), something like a zero-inflated Poisson model may be more helpful.
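For reference, here is a minimal sketch of the zero-inflated Poisson probability mass function. The parameter values `pi = 0.2` and `lam = 2.5` are hypothetical, chosen only for illustration, and the sketch assumes the time is recorded as a count (for example, whole hours), since the Poisson is a count distribution.

```python
import math

def zip_pmf(k, pi, lam):
    """P(X = k) for a zero-inflated Poisson.

    pi  : probability of a structural zero (e.g. never browses that day)
    lam : rate of the underlying Poisson component
    """
    poisson = math.exp(-lam) * lam**k / math.factorial(k)
    if k == 0:
        # Zeros come from both the structural part and the Poisson part
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson

pi, lam = 0.2, 2.5  # hypothetical values, not estimated from any data
for k in range(6):
    print(k, round(zip_pmf(k, pi, lam), 4))
```

The extra mass at zero is what distinguishes this from a plain Poisson; in practice both parameters would be estimated from the data rather than fixed by hand.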

Molls