I want to study the determinants of homicide rates. However, when exploring the data I see that my dependent variable (homicide rates) has many zeros and is positively skewed. Which distribution family / approach should I use?
Thanks in advance
In general, these types of data can be analyzed using either hurdle modelling or zero-inflated modelling.
Which type of modelling you would use depends on whether you can assume that there is only one process by which a zero rate can be produced (in which case you would use hurdle modelling), or that there are two different processes that can produce a zero (in which case you would use zero-inflated modelling). See this Cross Validated post for further details: What is the difference between zero-inflated and hurdle models?
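To make the distinction concrete, here is a minimal sketch of the two probability mass functions. It assumes a Poisson count component, which is only one common choice (a negative binomial is another), and the function names and parameter values (`zip_pmf`, `hurdle_pmf`, `p_zero`, `lam`) are made up for illustration rather than taken from any library.

```python
import math

def poisson_pmf(k, lam):
    """Ordinary Poisson probability of observing count k."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def zip_pmf(k, p_zero, lam):
    """Zero-inflated Poisson: a zero can come from two processes,
    a 'structural' zero (probability p_zero) or a Poisson draw of 0."""
    base = (1 - p_zero) * poisson_pmf(k, lam)
    return p_zero + base if k == 0 else base

def hurdle_pmf(k, p_zero, lam):
    """Hurdle Poisson: every zero comes from one binary process
    (probability p_zero); positive counts follow a zero-truncated Poisson."""
    if k == 0:
        return p_zero
    return (1 - p_zero) * poisson_pmf(k, lam) / (1 - math.exp(-lam))

# Hypothetical parameter values, chosen only to illustrate the difference.
p_zero, lam = 0.3, 2.0

zip_zero = zip_pmf(0, p_zero, lam)        # structural zeros plus Poisson zeros
hurdle_zero = hurdle_pmf(0, p_zero, lam)  # zeros from the binary part only

# Both are proper distributions: the probabilities sum to (almost exactly) 1.
zip_total = sum(zip_pmf(k, p_zero, lam) for k in range(60))
hurdle_total = sum(hurdle_pmf(k, p_zero, lam) for k in range(60))

print(f"P(Y=0) zero-inflated: {zip_zero:.4f}")
print(f"P(Y=0) hurdle:        {hurdle_zero:.4f}")
```

With the same `p_zero`, the zero-inflated model puts more mass on zero because the count process contributes additional zeros, whereas in the hurdle model `p_zero` is the whole probability of a zero. In practice you would fit these models with existing software rather than hand-coding the likelihood.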