
I have specified a mixed model in R on a data set with advertising awareness as the DV and advertising spend as the IV. The effect of advertising spend on awareness is moderated by two dimensions of brand perception, Dim1 and Dim2, and the Christmas and Easter periods are coded as dummies. Advertising awareness in the previous period is also included as a predictor. The data set contains 11,000 observations.

lmer(AdAwareness ~ AudioVisual*(Dim1 + Dim2) + Audio*(Dim1 + Dim2) + AdAwareness_t.1 + AudioVisual*(Dummy_Christmas + Dummy_Easter) + Audio*(Dummy_Christmas + Dummy_Easter) + (1 | brand), data = NYG)

To enable convergence, the data has been standardized (z-transformed).
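As a side note on the formula above, the following is a minimal base-R sketch showing which fixed-effect terms the call expands to. The names `f_original`, `f_compact` and `canonical_terms` are hypothetical helpers introduced only for this illustration, and the random intercept `(1 | brand)` is left out because only the fixed part is inspected. It suggests that the repeated main effects are collapsed by R, so the fixed part could also be written more compactly.

```r
# Fixed-effects part of the model formula as posted (random intercept omitted).
f_original <- AdAwareness ~ AudioVisual*(Dim1 + Dim2) + Audio*(Dim1 + Dim2) +
  AdAwareness_t.1 +
  AudioVisual*(Dummy_Christmas + Dummy_Easter) +
  Audio*(Dummy_Christmas + Dummy_Easter)

# Hypothetical compact rewrite (an assumption, not the original specification).
f_compact <- AdAwareness ~ (AudioVisual + Audio) *
  (Dim1 + Dim2 + Dummy_Christmas + Dummy_Easter) + AdAwareness_t.1

# Return the expanded term labels in a canonical form: variables inside an
# interaction are sorted, so "Dim1:Audio" and "Audio:Dim1" compare as equal.
canonical_terms <- function(f) {
  labs <- attr(terms(f), "term.labels")
  parts <- strsplit(labs, ":", fixed = TRUE)
  sort(vapply(parts, function(p) paste(sort(p), collapse = ":"), character(1)))
}

print(canonical_terms(f_original))

# TRUE if both formulas expand to the same set of fixed-effect terms.
same_terms <- identical(canonical_terms(f_original), canonical_terms(f_compact))
print(same_terms)
```

The expansion also makes explicit that the Christmas and Easter dummies enter both as main effects and as moderators of the two advertising variables.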

Question 1: I have tested the assumption of homoscedasticity with the Levene test, and the result is fine. I also investigated linearity by plotting the residuals against the fitted values.

[Image: plot of residuals and fitted values]

I am not sure how to interpret the plot. Can I still assume linearity, or should further tests be conducted? And do the isolated data points suggest that the data should be checked for potential outliers?
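For reference, the checks described in Question 1 can be sketched as follows. This is only an illustration under stated assumptions: the original data (`NYG`) are not available here, so the `sleepstudy` data shipped with lme4 serve as a stand-in, the grouping factor for the Levene-type test is assumed to be the random-effect group, and the cutoff of 3 for scaled residuals is an arbitrary screening value, not a rule.

```r
library(lme4)

# Stand-in model on lme4's built-in sleepstudy data (NOT the original NYG data).
fit <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

diag_df <- data.frame(
  fitted = fitted(fit),
  resid  = resid(fit),                 # raw conditional residuals
  scaled = resid(fit, scaled = TRUE),  # residuals divided by the residual SD
  group  = sleepstudy$Subject
)

# Residuals on the vertical axis, fitted values on the horizontal axis.
# A lowess curve that departs clearly from the zero line hints at
# non-linearity or some other misspecification.
plot(diag_df$fitted, diag_df$resid,
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
lines(lowess(diag_df$fitted, diag_df$resid), col = "red")

# Candidate outliers: observations with large scaled residuals
# (the threshold of 3 is an assumption chosen for illustration).
flagged <- diag_df[abs(diag_df$scaled) > 3, ]
print(flagged)

# Median-centred Levene-type (Brown-Forsythe) test in base R:
# one-way ANOVA on absolute deviations from the group medians.
abs_dev <- abs(diag_df$resid - ave(diag_df$resid, diag_df$group, FUN = median))
levene <- anova(lm(abs_dev ~ group, data = diag_df))
print(levene)
```

With the real model, `fit` would be the `lmer` object from above and `group` would be `brand` or whichever factor the Levene test was run on.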

Question 2: As described above, the model includes a lagged dependent variable, which has been described as problematic here:

https://statisticalhorizons.com/lagged-dependent-variables

The reason given is that the assumption of independence of the IVs cannot be supported, but I do not necessarily see that assumption violated in the model shown above. Would you in general consider the inclusion of lagged variables in mixed models problematic? I am aware of the discussions on this site about including lagged dependent variables (LDVs), but I did not find much that deals specifically with mixed models.
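One way to probe the concern in Question 2 is a small simulation. The sketch below is purely illustrative and rests on assumptions that do not come from the original data: 50 hypothetical brands, 10 periods each, a true lag coefficient `rho` of 0.3, and a single predictor `x` standing in for advertising spend. Because the lagged outcome contains the brand intercept by construction, it is correlated with the random effect, and the estimated lag coefficient can be compared with the known true value.

```r
library(lme4)

set.seed(1)
n_brand <- 50    # hypothetical number of brands
n_time  <- 10    # hypothetical number of periods per brand
rho     <- 0.3   # true coefficient of the lagged outcome (assumption)
u <- rnorm(n_brand)  # true brand-level random intercepts

# Simulate one brand: y_t = rho * y_{t-1} + 0.5 * x_t + u_brand + noise.
sim_brand <- function(b) {
  x <- rnorm(n_time)
  y <- numeric(n_time)
  y[1] <- u[b] + rnorm(1)
  for (t in 2:n_time) {
    y[t] <- rho * y[t - 1] + 0.5 * x[t] + u[b] + rnorm(1)
  }
  data.frame(brand = b, time = seq_len(n_time), x = x, y = y,
             y_lag = c(NA, y[-n_time]),  # lag built within the brand only
             u = u[b])
}

panel <- do.call(rbind, lapply(seq_len(n_brand), sim_brand))
panel$brand <- factor(panel$brand)
panel <- panel[!is.na(panel$y_lag), ]  # drop each brand's first period

# Random-intercept model with the lagged outcome as a predictor.
fit <- lmer(y ~ y_lag + x + (1 | brand), data = panel)
est_lag <- fixef(fit)[["y_lag"]]

# Correlation between the lagged outcome and the true brand intercept.
cor_lag_u <- cor(panel$y_lag, panel$u)

print(c(true_rho = rho, estimated = est_lag, cor_lag_with_intercept = cor_lag_u))
```

Changing `n_time` or the variance of `u` shows how the gap between the true and the estimated lag coefficient depends on panel length and on the size of the brand effects.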

Alexis
FeWa
  • On the term _versus_, see https://stats.stackexchange.com/questions/146533/versus-vs-how-to-properly-use-this-word-in-data-analysis More crucially for your post, plotting residuals on the vertical axis is much more common in what I see. It's the same information either way, but I see a majority of residuals near zero and a tilt in the others, which suggests some misspecification. That is all assuming that your "fitted" really is fitted, not the original response values. Your Question 2 is no doubt a good one, but hard to discuss on the evidence given here. – Nick Cox Apr 18 '20 at 09:19
  • I'd expect a variable like expenditure to be highly skewed, and possibly to include many zeros. It's hard work to get linear models to capture that and a battery of indicator variables won't help much. – Nick Cox Apr 18 '20 at 09:22
