I have a dataset of 23 participants from a within-subject experiment. The dependent variable takes integer values (e.g., 1, 2, 3) over a limited range (1 to 5). Because most of the values are 1 or 2, the data are not normally distributed, even after a log transformation. Is it still appropriate to fit a linear mixed-effects model to this dataset?
-
What does "most of the data is 1 or 2" mean? How are the data collected, and what are they measuring? – Robert Long Sep 05 '20 at 11:29
-
@RobertLong In this experiment, each participant reads 500 sentences from 6 texts (i.e., around 80 sentences per text). Each word in a sentence is defined as an interest area, and the dependent variable is the number of times the participant enters and then leaves the interest area of each word. Because most participants visit each interest area only 1 or 2 times (with 4 or 5 visits in a small number of instances), most values of the dependent variable are 1 or 2. That is why the data are not normally distributed. – Chloe Sep 05 '20 at 12:29
-
There is no requirement for the data to be normally distributed. I'm still not sure I understand the design. What are your random effects? Subjects and sentences? Are they crossed? Are you saying that each subject is only observed 1 or 2 times? – Robert Long Sep 05 '20 at 12:48
-
@RobertLong, sorry for not describing the design clearly. This is an eye-tracking experiment, and I am recording the number of times the eye looks at each word in each sentence from six different texts. If the participant looks at a word only once, it is recorded as 1 in the dependent variable. And yes, subjects and sentences are treated as crossed random effects. My main question is whether the dependent variable needs to be normally distributed to run the lmer model. Or is a normal distribution required only for the independent variable? – Chloe Sep 05 '20 at 13:28
-
There are no distributional requirements on either the dependent or independent variables. So I think you are saying that the dependent variable can be the integers 1 to 5 (can it ever be zero?). How many responses do you have for each subject, and what is your research question? – Robert Long Sep 05 '20 at 14:18
-
@RobertLong Yes, the dependent variable can be the integers 1 to 5 (no zero). Each subject has around 2000 responses (i.e., each participant read around 2000 words from 6 different texts). The participants read the texts with different background sounds, and my question is whether different types of background sound have a significant impact on the number of times they read each word. I read somewhere that a normal distribution is required before a significance test is carried out. That is why I am confused. – Chloe Sep 06 '20 at 02:15
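
The point made in the comments, that the raw dependent variable itself need not be normally distributed, can be illustrated with a small simulation. This is only a sketch under assumptions: the two condition names, their means, and the sample sizes are invented for illustration and are not the eye-tracking data from the question. It relies on the fact that the normality assumption in a linear model applies to the errors, not to the pooled response, and it does not by itself show whether a linear mixed model suits a 1-to-5 count.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical condition means, far apart relative to the noise (sd = 1).
condition_means = {"quiet": 0.0, "noise": 10.0}

# Simulate 2000 responses per condition: condition mean plus normal error.
data = [
    (cond, mu + random.gauss(0.0, 1.0))
    for cond, mu in condition_means.items()
    for _ in range(2000)
]


def excess_kurtosis(values):
    """Population excess kurtosis; roughly 0 for a normal sample."""
    m = statistics.fmean(values)
    var = statistics.fmean([(v - m) ** 2 for v in values])
    fourth = statistics.fmean([(v - m) ** 4 for v in values])
    return fourth / var ** 2 - 3.0


# Raw response pooled over conditions: strongly bimodal, so far from normal.
raw = [y for _, y in data]

# Residuals: each response minus the sample mean of its own condition.
fitted = {
    cond: statistics.fmean([y for c, y in data if c == cond])
    for cond in condition_means
}
residuals = [y - fitted[cond] for cond, y in data]

raw_kurtosis = excess_kurtosis(raw)
residual_kurtosis = excess_kurtosis(residuals)
print(f"raw: {raw_kurtosis:.2f}, residuals: {residual_kurtosis:.2f}")
```

In this sketch the pooled response has a clearly non-normal shape, while the residuals left after removing the condition means look approximately normal. For a fitted model, the analogous check would be made on its residuals rather than on a histogram of the raw dependent variable.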