
My aim is to compare means among different groups, but I only have a single value (the mean) for each group. I also have the total number of individuals per group. I don't have data for the individuals within each group, as I am comparing results from different papers, so the mean per group is all I have. In some cases I also have the SD and the range.

My data looks like:

Group A: n=20, mean=6.58
Group B: n=2, mean=7.4
Group C: n=15, mean=3.2
...

I would like to compare the groups among themselves, but I cannot use ANOVA, since it requires the within-group variance. I don't know if there is another tool for this.

If anybody can help, I would appreciate it! Thanks!

LauraMon
  • If you don't have the variance in each group then you cannot compare the groups statistically. All you can do is make a comment along the lines of... the mean in group A is less/greater than the mean in group B. – Mihael Apr 30 '19 at 14:31
  • @Mihael It may be surprising that your assertions are not correct: it is possible to construct a confidence interval for the difference between two means, assuming the sampling distribution of the difference is Normal, knowing *only* the difference! This indicates that meaningful testing of these means is possible even without knowing the variances. – whuber Apr 30 '19 at 14:57
  • @whuber a Normal distribution is defined by both mean and variance. I fail to see how a Normal distribution can be useful here when the variance is unknown, without making assumptions. If you could write it out in an answer, I'd be interested to see it. – Mihael Apr 30 '19 at 15:39
  • @Mihael Please see https://stats.stackexchange.com/a/1836/919. – whuber Apr 30 '19 at 16:00

1 Answer


The null hypothesis of ANOVA is that the within-group variances are equal and all group means are equal. Under this hypothesis, your data provide enough information for testing, provided you can justify assuming the sampling distributions of the group means are close to Normal.

Here's a brief analysis. To establish notation, let there be $d\ge 2$ groups of sizes $n_1, n_2, \ldots, n_d$ comprising $N=n_1+\cdots + n_d$ independent observations. Let the common variance be $\sigma^2$ and the common mean be $\mu.$ If $\mathcal{L}$ is the likelihood, write $\Lambda = -2\log(\mathcal L),$ which is minimized when the likelihood is maximized.

These assumptions imply the group means $x_i$ have Normal$(\mu, \sigma^2/n_i)$ distributions. Therefore

$$\Lambda = \sum_{i=1}^d \left[\log\left(\frac{\sigma^2}{n_i}\right) + \frac{(x_i-\mu)^2}{\sigma^2/n_i}\right].$$

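Writing out the Normal densities makes this explicit. The term $d\log(2\pi),$ an additive constant not involving the parameters, has been dropped because it does not affect the maximization:

$$-2\log\mathcal L = \sum_{i=1}^d \left[\log\left(2\pi\,\frac{\sigma^2}{n_i}\right) + \frac{(x_i-\mu)^2}{\sigma^2/n_i}\right] = d\log(2\pi) + \sum_{i=1}^d \left[\log\left(\frac{\sigma^2}{n_i}\right) + \frac{(x_i-\mu)^2}{\sigma^2/n_i}\right].$$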
Setting the gradient of $\Lambda$ to zero produces the familiar estimates

$$\hat \mu = \frac{1}{N}\sum_{i=1}^d n_i x_i$$

and

$$\hat\sigma^2 = \frac{1}{d}\sum_{i=1}^d n_i(x_i-\hat\mu)^2.$$

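Explicitly, the two partial derivatives of $\Lambda$ vanish at these values:

$$\frac{\partial \Lambda}{\partial \mu} = -\frac{2}{\sigma^2}\sum_{i=1}^d n_i(x_i-\mu) = 0 \quad\Longrightarrow\quad \hat\mu = \frac{1}{N}\sum_{i=1}^d n_i x_i,$$

$$\frac{\partial \Lambda}{\partial \sigma^2} = \frac{d}{\sigma^2} - \frac{1}{\sigma^4}\sum_{i=1}^d n_i(x_i-\mu)^2 = 0 \quad\Longrightarrow\quad \hat\sigma^2 = \frac{1}{d}\sum_{i=1}^d n_i(x_i-\hat\mu)^2.$$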
At this point it is natural to re-express the data in terms of Z scores as

$$z_i = \frac{x_i - \hat\mu}{\sqrt{\hat\sigma^2 / n_i}}$$

because each of these is approximately standard Normal and they are approximately independent. Consequently, you could inspect this set of Z scores for deviations from a standard Normal distribution. You might use a Normal-theory outlier test, for instance; or you could construct a statistic such as the Kolmogorov-Smirnov statistic or Anderson-Darling statistic. For formal testing, bootstrapping the distribution of this statistic from the estimates will work (and helps us avoid extensive further analysis!).
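As an informal illustration of the Kolmogorov-Smirnov idea, R's built-in `ks.test` can be applied to the Z scores directly. Its p-value is not valid here, because $\hat\mu$ and $\hat\sigma^2$ were estimated from the same data; the parametric bootstrap carried out below accounts for that.

```r
# Informal check only: compare Z scores to a standard Normal.
# (The p-value from ks.test is not valid here because the
#  parameters were estimated from the same data; the bootstrap
#  below accounts for that.)
z <- c(0.99, 0.51, -1.33)
ks.test(z, "pnorm")
```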


As an example, the estimates for the data $X=(6.58, 7.4, 3.2)$ with group sizes $n=(20, 2, 15)$ are

$$(\hat \mu, \hat\sigma^2) = (5.25, 35.89)$$

with corresponding Z scores

$$z = (0.99, 0.51, -1.33).$$

These aren't unusual, so we haven't found significant evidence of a difference in group means. Indeed, it's scarcely possible to do so with just three groups: one of the means would have to be far from the other two.
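These numbers can be checked with a few lines of R (a standalone sketch that recomputes the estimates rather than reusing the functions in the full implementation below):

```r
# Weighted MLEs and Z scores for the three-group example.
x <- c(6.58, 7.4, 3.2)
n <- c(20, 2, 15)
mu.hat <- sum(n * x) / sum(n)        # weighted mean
s2.hat <- mean(n * (x - mu.hat)^2)   # MLE of the common variance
z <- (x - mu.hat) / sqrt(s2.hat / n)
round(c(mu.hat, s2.hat), 2)          # 5.25 35.89
round(z, 2)                          # 0.99 0.51 -1.33
```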

Suppose, though, that you have more data from more studies and therefore you have means of more than three groups. Then this approach could yield significant results. As an example, consider ten groups with means $x=(1,2,\ldots, 9, 30).$ That last value of $30$ is extreme. The estimated parameters are now $\hat\mu=7.5$ and $\hat\sigma^2=124.5$ with Z scores

$$z = (-0.82, -0.70, \ldots, 0.06, 0.19, 2.85).$$

That last one ($2.85$) is a little unusual and indeed, the parametric bootstrap of the KS statistic gives a p-value of $0.011,$ indicating a significant difference.
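The ten-group figures can be reproduced the same way (a sketch; each group is given size $2,$ matching the commented-out lines in the code below):

```r
# Weighted MLEs and Z scores for the ten-group example.
x <- c(1:9, 30)
n <- rep(2, length(x))
mu.hat <- sum(n * x) / sum(n)        # 7.5
s2.hat <- mean(n * (x - mu.hat)^2)   # 124.5
z <- (x - mu.hat) / sqrt(s2.hat / n)
round(range(z), 2)                   # -0.82 2.85
```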

The following R code gives an implementation. The p-value for the data in the question is $0.605.$

#
# Data.
#
x <- c(6.58, 7.4, 3.2)
n <- c(20, 2, 15)

# x <- c(1:9, 30)
# n <- rep(2, length(x))
#
# Parameter estimates.
#
theta.hat <- function(x, n) {
  N <- sum(n)
  m <- sum(n*x) / N
  s2 <- mean(n*(x-m)^2)
  c(x.hat=m, s2.hat=s2)
}
(theta <- theta.hat(x, n))
#
# Bootstrap statistic.
#
KS.stat <- function(x, theta, n) {
  # Compare the plotting positions i/(k+1) to the sorted fitted
  # Normal probabilities of the group means.
  k <- length(x)
  p <- sort(pnorm(x, theta["x.hat"], sqrt(theta["s2.hat"]/n)))
  max(abs((1:k)/(k+1) - p))
}
stat <- KS.stat(x, theta, n)
#
# Display Z scores.
#
print(signif((x - theta["x.hat"]) / sqrt(theta["s2.hat"] / n), 2))
#
# Bootstrap the statistic.
#
sim <- replicate(1e4, {
  y <- rnorm(length(n), theta["x.hat"], sqrt(theta["s2.hat"]/n))
  p <- theta.hat(y, n)
  KS.stat(y, p, n)
})
print(c(`p-value`=mean(c(stat, sim) >= stat)))
whuber