Let's say I have two sets of measurements that look like this:
Subject  Measurement A    Measurement B  No. of subjects with this pattern
1        a1,a2            -              1
2        -                -              1
3        a3,a4,a5         b1             1
4        a6               b2,b3          1
5        a7               -              6
6        a8,a9            b4             16
7        a10,a11,a12,a13  b5             3
8        -                b6             3
9        a13              b7             76
I would like to run a paired test on them, but I am not sure what to do when a subject has more than one measurement on either side of the pair. Should I take an average if there is more than one measurement? Should I pair each measurement with the corresponding measurement, so that some values would be counted more than once (like a8 with b4 and a9 with b4)? Should I pick one measurement wherever there are several (so pick one of a3, a4 or a5 to pair with b1)?
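To make the three options concrete, here is a minimal sketch of how each one would build the pairs. The numeric values and the dict layout (subject ID mapped to a list of repeated measurements) are made up for illustration and are not my real data.

```python
import itertools
import numpy as np

# Hypothetical values; keys are subject IDs, lists hold repeated measurements.
a = {3: [5.1, 4.9, 5.3], 4: [6.0], 6: [4.2, 4.6], 9: [5.5]}
b = {3: [4.0], 4: [5.2, 5.6], 6: [3.9], 9: [4.8]}

# Only subjects with at least one A and one B value can be paired at all.
subjects = sorted(s for s in a if s in b and a[s] and b[s])

# Option 1: average the repeats, giving exactly one pair per subject.
mean_pairs = [(float(np.mean(a[s])), float(np.mean(b[s]))) for s in subjects]

# Option 2: pair every A with every B of the same subject.
# Repeated values are reused, so these pairs are not independent.
cross_pairs = [(x, y) for s in subjects
               for x, y in itertools.product(a[s], b[s])]

# Option 3: pick one A and one B at random per subject.
rng = np.random.default_rng(0)
random_pairs = [(float(rng.choice(a[s])), float(rng.choice(b[s])))
                for s in subjects]

print(len(mean_pairs), len(cross_pairs), len(random_pairs))
```

Options 1 and 3 keep one pair per subject, while option 2 produces more pairs than subjects, which is the double counting I am worried about.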
I feel that some weighting would be appropriate: i.e. use b5 paired with each of a10, a11, a12 and a13 with weight 1, the pairs for subjects 4 and 6 with weight 2, the pairs for subject 3 with weight 1.33, and the rest with weight 4. But what would I do with unpaired data like subjects 1, 2, 5 and 8?
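The weighting I have in mind amounts to giving every subject the same total weight of 4, split evenly over the pairs that subject contributes. A minimal sketch, assuming the per-subject counts from the example table (the `counts` dict and the `TOTAL_WEIGHT` name are just for illustration):

```python
# Number of A and B measurements per subject, taken from the example table.
counts = {
    1: (2, 0), 2: (0, 0), 3: (3, 1), 4: (1, 2), 5: (1, 0),
    6: (2, 1), 7: (4, 1), 8: (0, 1), 9: (1, 1),
}

TOTAL_WEIGHT = 4  # assumed total weight per subject

weights = {}    # weight applied to each (A, B) pair of a subject
unpaired = []   # subjects that cannot form any pair

for subject, (n_a, n_b) in counts.items():
    n_pairs = n_a * n_b  # every A paired with every B
    if n_pairs == 0:
        unpaired.append(subject)
    else:
        weights[subject] = TOTAL_WEIGHT / n_pairs

print(weights)
print(unpaired)
```

This reproduces the weights listed above, and it also shows the gap: subjects 1, 2, 5 and 8 form no pairs, so the scheme assigns them nothing.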
Whether a subject ends up with a clean single pair like subject 9 is random (as listed in the last column, I have such measurements for 76 subjects, so almost 70 % of the sample). If I run paired tests on these 76 pairs (Wilcoxon, since the underlying data are close to normal but not quite), I get p-values of 0.000001 and lower. Is it even worth trying to use the rest? I am using scipy (and numpy).
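For reference, the test on the complete pairs is roughly the following. This is only a sketch: the arrays are synthetic stand-ins generated with numpy, not my measurements, and the shift between A and B is an arbitrary assumption.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for the 76 subjects with exactly one A and one B value.
rng = np.random.default_rng(42)
n_subjects = 76
a_vals = rng.normal(loc=10.0, scale=2.0, size=n_subjects)
# Assumed systematic difference of about 1 between A and B, plus noise.
b_vals = a_vals - rng.normal(loc=1.0, scale=1.0, size=n_subjects)

# Wilcoxon signed-rank test on the paired differences a_vals - b_vals.
res = stats.wilcoxon(a_vals, b_vals)
print(res.statistic, res.pvalue)
```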