How does one go about replacing true zeros in compositional data sets? There is a lot of information on handling rounded zeros, such as multiplicative replacement (`multRepl` in the R package zCompositions), but little on true zeros.

Kyle
  • What do you mean by “true zeros” and why do you want to replace them? If they are valid values, why would you want to replace them with incorrect ones? – Tim Aug 22 '21 at 13:13
  • My data contain true zeros, in that no behaviour was observed at all. Rounded zeros are values that fall below the detection limit and are therefore recorded as zero, but in fact have some small value. I want to statistically analyse time spent carrying out behaviours against some independent variables. However, my data are non-normal and compositional and thus need to be log-transformed, which is not possible if the data contain true zeros. If I can replace my zeros, then I can proceed with my analyses. – Kyle Aug 22 '21 at 13:32
  • Why exactly do you need the data to be normally distributed? There’s no way to make the data normally distributed if you have exact zeros. It is also unlikely that the statistical method you want to use assumes the data are normally distributed. – Tim Aug 22 '21 at 18:22
  • It's not so much that the data need to be normally distributed; rather, to take into account the compositional nature of the data, they need to be log-transformed. Standard multivariate techniques are inappropriate and uninterpretable for such data. – Kyle Aug 24 '21 at 07:04
  • *Why* does it need to be log-transformed? [You usually don't need to do this.](https://stats.stackexchange.com/questions/298/in-linear-regression-when-is-it-appropriate-to-use-the-log-of-an-independent-va?noredirect=1&lq=1) If you really need to, you can always just add `+ 1` to the variable before taking the log. – Tim Aug 24 '21 at 07:10
  • Working with compositional data is new to me, but apparently, in order to take into account the collinearity or relatedness constraint of the variables (as they are percentages and sum to 100%), the data need to be transformed. For some context, see https://ijbnpa.biomedcentral.com/articles/10.1186/s12966-018-0685-1. – Kyle Aug 24 '21 at 07:29
  • A log transform does *nothing* about collinearity. – Tim Aug 24 '21 at 07:33
  • Apologies if I'm not understanding you correctly. It's just that everything I read on compositional data advises that it needs to be transformed prior to analysis. I think arcsine transformations are also used. In your opinion, then, should I not transform the data? – Kyle Aug 24 '21 at 07:37
  • I'm not sure what exactly you are doing or what your data are. Maybe you could edit the question to add more details, such as an example or some plots? Otherwise, it might be hard to get it answered. It's definitely not the case that you ought to log-transform variables "by default"; it is always done for a reason. – Tim Aug 24 '21 at 07:41
  • I'll update with some example data later today. I've recorded time spent (in decimal hours and minutes) carrying out 4 different behaviours for cattle in a given period (day or night). The times for the 4 behaviours sum to the total time for that period. As each period varies in length, I converted the values to percentages. I then want to analyse how behaviours differ between Day and Night periods (and additionally whether milk status, Milking or Dry, has any effect). Thank you so far for discussing this matter - it has been quite confusing for me, to say the least. – Kyle Aug 24 '21 at 07:53
  • Let us [continue this discussion in chat](https://chat.stackexchange.com/rooms/128884/discussion-between-kyle-and-tim). – Kyle Aug 24 '21 at 09:29
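
To make the issue discussed in the comments concrete, here is a minimal Python sketch of why a zero part blocks a log-ratio transform, and of what multiplicative replacement does to a composition. The numbers, the function names (`closure`, `clr`, `multiplicative_replacement`) and the choice `delta=0.5` are all hypothetical, and this is not the zCompositions API. Multiplicative replacement was designed for rounded zeros; applying it to true zeros here is only an assumption for illustration, not a recommendation.

```python
import math

def closure(parts, total=100.0):
    """Rescale non-negative parts so they sum to `total` (percentages by default)."""
    s = sum(parts)
    return [total * p / s for p in parts]

def clr(parts):
    """Centred log-ratio: log of each part minus the mean of the logs.
    Raises ValueError if any part is zero or negative, since log is undefined there."""
    if any(p <= 0 for p in parts):
        raise ValueError("clr is undefined when a part is zero or negative")
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def multiplicative_replacement(parts, delta, total=100.0):
    """Replace each zero by `delta` and shrink the non-zero parts
    proportionally so the composition still sums to `total`."""
    n_zero = sum(1 for p in parts if p == 0)
    shrink = 1.0 - n_zero * delta / total
    return [delta if p == 0 else p * shrink for p in parts]

# Hypothetical hours for four behaviours in one 12-hour period;
# the fourth behaviour was never observed (a zero).
hours = [6.0, 4.5, 1.5, 0.0]
percent = closure(hours)

try:
    clr(percent)
except ValueError as err:
    print("clr failed:", err)

# delta=0.5 (half a percentage point) is an arbitrary illustrative value.
replaced = multiplicative_replacement(percent, delta=0.5)
print(replaced, sum(replaced))
print(clr(replaced))
```

The replacement keeps the ratios between the non-zero parts unchanged, which is the usual argument for it over simply adding a constant to every part. Whether any replacement is defensible when the zero means "behaviour never occurred" is exactly the open question in this thread.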
