I have a question about when to collapse raw data to means per unit (e.g., subjects). In my data I have the following variables:
id: subject id
rt: reaction time
type: either A or B
being: either animal, human, robot or plant
The structure of the data is that I have 100 trials per subject, and each trial has an rt, a type and a being.
If I use two different collapsing methods, I get different values:
Method A:
I collapse my data so that I have a mean rt for each subject for each combination of type and being. Now I want to collapse the being values human and robot together and the values animal and plant together. So I add the two means and divide by 2 (or use the mean function).
So I get: MeanA_humanrobot and MeanA_animalplant
Method B:
I create a factor (e.g., being_category) which is 1 for human or robot and 2 for animal or plant, and then collapse per subject id, type and this factor.
Here I get: MeanB_humanrobot and MeanB_animalplant
My problem is that MeanA_humanrobot is not exactly equal to MeanB_humanrobot (and the same holds for animalplant).
The differences are small, but I do not understand conceptually why there are any differences.
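A minimal sketch of the two methods on made-up data may make the comparison concrete. Everything in it is hypothetical: the rt values, the trial counts per being, the helper name cell_means, and the use of string labels instead of 1 and 2 for being_category. The toy data deliberately has unequal numbers of trials per being, which is an assumption and may or may not match the real data set.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical trials for one subject and one type: (id, type, being, rt in ms).
trials = [
    (1, "A", "human", 500),
    (1, "A", "human", 520),
    (1, "A", "human", 540),
    (1, "A", "robot", 600),
    (1, "A", "animal", 450),
    (1, "A", "plant", 470),
    (1, "A", "plant", 490),
    (1, "A", "plant", 510),
]

# Grouping factor from Method B (string labels used here instead of 1 and 2).
being_category = {
    "human": "humanrobot",
    "robot": "humanrobot",
    "animal": "animalplant",
    "plant": "animalplant",
}


def cell_means(rows, key):
    """Group rt values by key(row) and return the mean rt per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row[3])
    return {k: mean(v) for k, v in groups.items()}


# Method A: mean per (id, type, being) first,
# then the unweighted mean of those being means within each category.
per_being = cell_means(trials, key=lambda r: (r[0], r[1], r[2]))
grouped_means = defaultdict(list)
for (sid, typ, being), m in per_being.items():
    grouped_means[(sid, typ, being_category[being])].append(m)
method_a = {k: mean(v) for k, v in grouped_means.items()}

# Method B: mean per (id, type, being_category) taken directly over the trials.
method_b = cell_means(trials, key=lambda r: (r[0], r[1], being_category[r[2]]))

for k in sorted(method_a):
    print(k, "Method A:", method_a[k], "Method B:", method_b[k])
```

With this toy data, Method A gives 560 for humanrobot and Method B gives 540, because Method A weights the human and robot means equally while Method B weights every trial equally. If each being had the same number of trials per subject and type, the two results in this sketch would coincide.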
So basically, I think this comes down to the question of when to collapse the data. Can someone help out here?