I have a mixed-effects model of crop yield as a function of rainfall and temperature:

mdl <- lmer(yield ~ rainfall + I(rainfall^2) + I(temperature^2) +
       (1|location) + (1|year))

When I look at the predicted values, some of the predicted yields are negative. Realistically, yield can never be negative. So what should I do?

(1) Set all negative predicted values of yield to 0. Does this make sense?

OR

(2) Model the log of the observed yield:

 mdl <- lmer(log(yield) ~ rainfall + I(rainfall^2) + I(temperature^2) +
        (1|location) + (1|year))

Back-transforming the predicted log yield always gives a positive value, so my predicted yield will never be negative. However, taking the log of yield makes my residuals more skewed, and they violate the normality assumption.
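
To make the two options concrete, here is a minimal sketch on simulated data. The data frame `dat` and every value in it are made up for illustration, and plain `lm` stands in for `lmer` (so the random effects for location and year are left out) only so that the snippet runs without extra packages:

```r
# Simulated stand-in data: `dat` and all values here are hypothetical.
set.seed(1)
n <- 200
dat <- data.frame(
  rainfall    = runif(n, 0, 10),
  temperature = runif(n, 10, 30)
)
# Positive, right-skewed yield generated on the log scale.
dat$yield <- exp(0.5 + 0.3 * dat$rainfall - 0.03 * dat$rainfall^2 -
                 0.001 * dat$temperature^2 + rnorm(n, sd = 0.3))

# Option 1: fit on the raw scale, then truncate negative predictions at 0.
fit_raw  <- lm(yield ~ rainfall + I(rainfall^2) + I(temperature^2), data = dat)
pred_raw <- pmax(predict(fit_raw), 0)

# Option 2: fit on the log scale and back-transform with exp().
fit_log  <- lm(log(yield) ~ rainfall + I(rainfall^2) + I(temperature^2), data = dat)
pred_log <- exp(predict(fit_log))

# Option 1 gives non-negative predictions; option 2 gives strictly positive ones.
range(pred_raw)
range(pred_log)
```

One thing I am aware of with option 2: if the errors are normal on the log scale, `exp()` of a log-scale prediction targets the conditional median of yield rather than its mean.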

Any advice?

I can provide data and plots if needed.

user53020
  • What do you mean that "taking the log of yield messes up my linear model assumptions"? [One of] the assumptions of a linear model is that negative predicted values are perfectly reasonable. – gung - Reinstate Monica Aug 08 '17 at 14:38
  • My raw data are normal but when I take a log of it, the residuals of the model are more skewed and violate the normality assumption. – user53020 Aug 08 '17 at 14:55
  • The raw data are normal, or the residuals are (see [here](https://stats.stackexchange.com/a/33320/7290))? Can you add some plots, eg, qq-plots of the residuals of the 2 models? Why do you have `year` as a random effect? – gung - Reinstate Monica Aug 08 '17 at 14:58
