
I'm using R's `kappa2` function (irr package) to obtain Cohen's Kappa and the `cohen.kappa` function (psych package) for the corresponding confidence intervals. When interrater agreement is significant, I usually obtain a p-value ≤ 0.05 and a 95% CI that does not include 0. Now, for the first time, the output is a significant p-value, but the 95% CI spans 0. Is there another explanation for this discrepancy, other than my code being erroneous?

> x <- matrix(c(y, z), nrow = 49, ncol = 2, byrow = FALSE)
> kappa2(x, weight = "unweighted", sort.levels = FALSE)

Cohen's Kappa for 2 Raters (Weights: unweighted)
 Subjects = 49 
   Raters = 2 
    Kappa = 0.222 

        z = 2.47 
  p-value = 0.0133 

> cohen.kappa(x)

Call: cohen.kappa1(x = x, w = w, n.obs = n.obs, alpha = alpha, levels = levels)

Cohen Kappa and Weighted Kappa correlation coefficients and confidence boundaries 
                 lower estimate upper
unweighted kappa -0.14     0.22  0.59
weighted kappa   -0.14     0.22  0.59

 Number of subjects = 49 
user302969
  • It would be helpful if you included `y` and `z` data in your question for a fully reproducible example. – Brian D May 10 '21 at 15:29

2 Answers


There are two ways of computing the standard error of $\kappa$. One assumes the true value is zero and is appropriate for testing whether $\kappa = 0$, a hypothesis which is seldom of any interest. The other does not make that assumption and is appropriate for setting confidence intervals. Without knowledge of the internals of those packages it is impossible to be sure that this is the explanation, but it seems likely.

The issue is discussed in Fleiss, Cohen, and Everitt (1969), "Large sample standard errors of kappa and weighted kappa", *Psychological Bulletin*, 72(5), 323–327.
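To see the two standard errors at work, here is a minimal sketch that reproduces the pattern. Since the question's `y` and `z` are not available, it simulates stand-in ratings, so the exact numbers will differ from the output above:

```r
library(irr)    # kappa2: p-value based on the SE computed under H0: kappa = 0
library(psych)  # cohen.kappa: CI based on the SE that does not assume kappa = 0

set.seed(1)
y <- sample(1:2, 49, replace = TRUE)                    # rater 1 (simulated)
z <- ifelse(runif(49) < 0.7, y, sample(1:2, 49, TRUE))  # rater 2, mostly agreeing
x <- cbind(y, z)

kappa2(x, weight = "unweighted")  # significance test against kappa = 0 (null SE)
cohen.kappa(x)                    # confidence interval (non-null SE)
```

With a modest kappa and only 49 subjects, the two can disagree near the boundary: a p-value just under 0.05 alongside a CI that just crosses 0.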

mdewey
  • (+1) The `psych::cohen.kappa` routine uses formula (8) in that article, which seems to be appropriate for computing a confidence interval, and not only test $H_0: \kappa = 0$. – chl Nov 17 '20 at 16:21
  • I believe that `irr::kappa2` uses formula (9) in that article. – Brian D Feb 08 '21 at 16:17

If you show the output from `str(cohen.kappa(x))`, it includes the weighted and unweighted variance values.

You can compare these values to those obtained from `irr::kappa2()`, which does not report the variance by default. However, based on the `kappa2` function code, it appears you can recover the variance as

`(out$value / out$statistic)^2`, where `out <- kappa2(x, weight = "unweighted")`; the reported `z` statistic is the kappa estimate divided by its standard error, so squaring `value/statistic` recovers the variance.
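As a sketch of that comparison, assuming the usual field names (`$value` and `$statistic` from `kappa2`'s return value, `$var.kappa` from `cohen.kappa`'s):

```r
out <- kappa2(x, weight = "unweighted")
(out$value / out$statistic)^2  # variance implied by kappa2: (kappa / z)^2

ck <- cohen.kappa(x)
ck$var.kappa                   # variance cohen.kappa uses for the unweighted CI
```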

Note that this is typically a different value than the one reported by `psych::cohen.kappa`; the two functions use slightly different formulas for the standard error of kappa, which is why you are seeing the disparity.

Using the computed variance from `kappa2`, you can see that its CI does not contain 0, which is why the p-value is 'significant':

.222 + qnorm(c(.025,.975))*(.222/2.47)
[1] 0.04584129 0.39815871
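For completeness, the same null standard error reproduces the reported p-value as a two-sided normal test, using the rounded values from the output above:

```r
2 * (1 - pnorm(2.47))  # ~0.0135, matching kappa2's p-value of 0.0133 up to rounding
```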
Brian D
  • Hi, @Brian D Could you please explain the formula used here to calculate the confidence interval for kappa (`.222 + qnorm(c(.025,.975))*(.222/2.47)`)? The formula looks different from what I saw elsewhere (https://stats.stackexchange.com/questions/208613/get-a-95-confidence-interval-for-cohens-kappa-in-r) and I am not sure whether the two are actually doing the same thing. – Chloe May 05 '21 at 02:26
  • 1
    Looking at the source code for `irr::kappa2`, you can see that the p-value is computed as `p.value – Brian D May 10 '21 at 15:39