A pupil has come to me with data collected as follows: a questionnaire with 9 questions in part A and 19 in part B. Each question is either answered true/false or left unanswered. She coded the answers as 1 for correct, -1 for incorrect and 0 for unanswered. 38 subjects took part.
She wants to test whether the questions on part A were easier than those on part B.
It seems to me that we can treat this as a paired sample: we calculate a mean score for each part and subtract the part-B mean from the part-A mean. This gives a single column of data, diffMeans, and we want to test whether its mean is greater than 0. I believe we can do this with a one-sided t-test, which gives a p-value of 0.145.
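A minimal sketch of that paired calculation, with made-up scoreA/scoreB totals standing in for the real data below (so the p-value it prints is not the 0.145 above):

```r
# Hypothetical per-subject totals: scoreA in -9..9, scoreB in -19..19
set.seed(42)
scoreA <- sample(-9:9, 38, replace = TRUE)    # part A: 9 questions
scoreB <- sample(-19:19, 38, replace = TRUE)  # part B: 19 questions

diffMeans <- scoreA / 9 - scoreB / 19         # one difference per subject
res <- t.test(diffMeans, mu = 0, alternative = "greater")
res$p.value
```

The pairing is built in before the test is run: each subject contributes a single number, the difference of their own two means.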
Now my problem...
For my own interest I tried to create an appropriate simulation of the above situation in R. I repeated the experiment k = 10000 times, where one experiment is:
- Generate 9 random integers between -1 and 1, sum them and divide by 9 to get sampleA_Mean.
- Do the same for sampleB_Mean (generate 19 such values, sum them and divide by 19).
- sampleDiff = sampleA_Mean-sampleB_Mean to get that simulated subject's difference in scores.
- Do this 38 times.
- Calculate the mean of this sample of 38 differences.
Calculating the p-value in two ways, first from a fitted normal distribution and then directly from the simulated data, I get essentially the same value: 0.202.
This value seems to be the p-value of the associated 2 sample t-test!
So my questions: why do I get this result? Why does my simulation treat the generated data as two independent samples rather than as paired? I have evidently failed to preserve the pairing, so how can that be done?
Here is the R code of my simulation:
library(MASS)   # for fitdistr()
data <-
structure(list(A = c(-4L, 0L, -2L, 2L, 1L, 1L, -3L, -1L, 3L,
7L, 0L, -1L, 1L, 0L, 4L, 1L, 0L, 0L, -2L, -2L, 4L, 2L, 1L, 5L,
1L, 6L, 1L, 3L, 0L, 6L, 2L, 1L, -1L, 0L, 0L, 3L, 0L, 2L), B = c(0.166666667,
-1.666666667, 2, 5.5, 5.5, 4, -2.166666667, 0, -6.166666667,
-1, 2, 2.166666667, 1.166666667, 1, 7.5, 5.666666667, 3, -1,
-1, -2, 5.166666667, 2.166666667, 2.5, 8.833333333, 2.5, 3.166666667,
6, 2, -5, 6, -0.333333333, -1.166666667, -4, 0.166666667, -1.666666667,
-0.166666667, -2.833333333, 6)), .Names = c("A", "B"), class = "data.frame", row.names = c(NA,
-38L))
##################################
actualA_Mean<-data$A/9            # per-subject mean score on part A
actualB_Mean<-data$B/19           # per-subject mean score on part B
actualDiff<-actualA_Mean-actualB_Mean
actualDiffMean<-mean(actualDiff)  # observed mean difference
##################################
sampleDiffMean <- numeric(10000)   # one simulated group mean per replicate
for (k in 1:10000){
  sampleDiff <- numeric(38)        # one simulated difference per subject
  for (j in 1:38){
    # part A: 9 answers, each drawn from {-1, 0, 1} with equal probability
    sampleA_Mean <- sum(sample(-1:1, 9, replace=TRUE))/9
    # part B: 19 answers drawn the same way
    sampleB_Mean <- sum(sample(-1:1, 19, replace=TRUE))/19
    sampleDiff[j] <- sampleA_Mean - sampleB_Mean
  }
  sampleDiffMean[k] <- mean(sampleDiff)
}
##################################
fit <- fitdistr(sampleDiffMean, "normal")   # fit N(mu, sigma) to the simulated means
para <- fit$estimate
hist(sampleDiffMean, breaks=20, prob=TRUE)
curve(dnorm(x, para[1], para[2]), col=2, add=TRUE)
##################################
# p-value from the fitted normal: P(simulated mean >= observed mean)
pValueNormal <- pnorm(actualDiffMean, mean=para[1], sd=para[2], lower.tail=FALSE)
cat("pValue (normal) = ", pValueNormal, "\n")
# p-value directly from the simulation: proportion of replicates at least as extreme
pValueSimulation <- mean(sampleDiffMean >= actualDiffMean)
cat("pValue (simulation) = ", pValueSimulation, "\n")
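For reference, the nested loops above can also be vectorized. This sketch draws all answers for all 38 simulated subjects at once (names nrep and simMean are mine); it should produce the same null distribution up to simulation noise:

```r
set.seed(1)
nrep <- 10000
simMean <- replicate(nrep, {
  # 38 subjects x 9 part-A answers, each drawn from {-1, 0, 1}
  A <- matrix(sample(-1:1, 38 * 9,  replace = TRUE), nrow = 38)
  # 38 subjects x 19 part-B answers
  B <- matrix(sample(-1:1, 38 * 19, replace = TRUE), nrow = 38)
  mean(rowMeans(A) - rowMeans(B))   # group mean of per-subject differences
})
```

Note that A and B are still generated independently of each other, so this version has exactly the same pairing problem as the loop version.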