
Since my research data seem to follow a log-normal distribution, I was curious to learn more about the topic. In addition to the very nice answers here on Cross Validated (In linear regression, when is it appropriate to use the log of an independent variable instead of the actual values?), I've found two quite interesting sources. The first is a general review paper on log-normal distributions and their role in life and various scientific disciplines: http://stat.ethz.ch/~stahel/lognormal/bioscience.pdf. The second is a paper whose essence is expressed right in its title, "Do not log-transform count data": http://onlinelibrary.wiley.com/doi/10.1111/j.2041-210X.2010.00021.x/full.

Now, my questions:

1) Based on the first paper, which emphasizes the multiplicative nature of the log-normal distribution, does it make sense to argue that, if my data after log transformation consist of several normal distributions (a mixture model), this can be explained by the presence of several types of interacting factors (one inferred from the detected log-normality, another from the detected mixture)?

2) Should I accept the advice of the second paper and abandon potentially valuable information, as described above, and instead use a Poisson model rather than a log transformation? I ask especially because the ultimate goal of my research study is latent variable modeling.
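To make the idea behind question 1 concrete, here is a minimal simulation sketch; it is not taken from either paper. It assumes that each observation is a product of many independent positive factors, and that two hypothetical sub-populations differ only in the typical size of those factors. The function names, group labels, and parameter values are illustrative assumptions, not properties of my actual data.

```python
import math
import random
import statistics


def multiplicative_sample(n_obs, n_factors, low, high, rng):
    """Each observation is a product of n_factors independent positive factors."""
    return [
        math.prod(rng.uniform(low, high) for _ in range(n_factors))
        for _ in range(n_obs)
    ]


def skewness(xs):
    """Sample skewness (third standardized moment)."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return statistics.fmean([((x - m) / s) ** 3 for x in xs])


rng = random.Random(0)

# Two hypothetical sub-populations (assumed factor ranges, chosen for illustration).
group_a = multiplicative_sample(5000, 30, 0.8, 1.2, rng)
group_b = multiplicative_sample(5000, 30, 1.0, 1.4, rng)

log_a = [math.log(x) for x in group_a]
log_b = [math.log(x) for x in group_b]

# Within one group: raw values are right-skewed, logs are roughly symmetric.
raw_skew_a = skewness(group_a)
log_skew_a = skewness(log_a)

# Across groups: on the log scale the two components sit at different locations,
# so the pooled log data would look like a mixture of two normals.
mean_log_a = statistics.fmean(log_a)
mean_log_b = statistics.fmean(log_b)

print(f"skewness raw (A): {raw_skew_a:.2f}, skewness log (A): {log_skew_a:.2f}")
print(f"mean log (A): {mean_log_a:.2f}, mean log (B): {mean_log_b:.2f}")
```

In this sketch the log-normality comes from the multiplicative mechanism within each group, while the mixture comes from the existence of two groups. It only shows that such a two-layer explanation is possible; it does not show that log-normal data must contain a mixture, or that a mixture must arise this way.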

kjetil b halvorsen
Aleksandr Blekh
  • There is no major contradiction between the main ideas in these papers. The lognormal is a possible distribution for positive continuous variables, and 0 could not be a possible value for such distributions. Conversely, a common starting point for counts that could be zero is a Poisson distribution. There is some territory in between these questions, but they are different questions. I think just mentioning "latent variable modeling" doesn't give enough detail to allow detailed advice. Similarly, lognormals don't themselves imply that you have a mixture of normals. – Nick Cox Sep 16 '14 at 16:41
  • @NickCox: Thank you for the comment! Now I have a little more clarity on the subject. What information about my data and/or SEM models would be enough to expect additional feedback? As for the presence of a mixture of normals, I was referring to my particular situation, and I verified it using both visual and analytic approaches. – Aleksandr Blekh Sep 16 '14 at 16:51
  • I guess you need new questions on either or both of those issues. But as I don't know what your extra questions are going to be, I can't well advise how to pose them. – Nick Cox Sep 16 '14 at 16:57
  • @NickCox: Thank you. I will post new questions as soon as I am able to formulate them decently. – Aleksandr Blekh Sep 16 '14 at 17:01
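The distinction drawn in the comments between positive continuous data and counts that can be zero can be illustrated with a small simulation. This is only a sketch under stated assumptions: the counts are simulated as Poisson with an assumed rate of 2.0, the helper `poisson_draw` is a hypothetical name, and the `log(y + 1)` workaround is one common choice rather than something prescribed by either paper.

```python
import math
import random
import statistics


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


rng = random.Random(1)
true_rate = 2.0  # assumed value, for illustration only
counts = [poisson_draw(true_rate, rng) for _ in range(10000)]

# Zeros occur in count data, and log(0) is undefined, so a plain log
# transformation cannot be applied without some adjustment.
n_zeros = counts.count(0)

# Under a Poisson model, the maximum likelihood estimate of the rate
# is simply the sample mean.
rate_hat = statistics.fmean(counts)

# A common workaround: average log(y + 1), then back-transform.
# By Jensen's inequality this cannot exceed the sample mean.
mean_log = statistics.fmean([math.log(c + 1) for c in counts])
back_transformed = math.exp(mean_log) - 1

print(f"zeros: {n_zeros}")
print(f"Poisson estimate of the mean: {rate_hat:.3f}")
print(f"back-transformed log(y + 1) estimate: {back_transformed:.3f}")
```

The back-transformed value estimates a different quantity than the mean count, which is one reason the two approaches answer different questions.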
