Questions tagged [group-differences]

Group differences broadly refer to statistics which quantify the differences between two or more subpopulations. Examples include

  • medical trials where one group receives a drug and the other does not
  • the 90-10 earnings differential within certain education brackets
  • crime rates across social groups and districts within a city

A large array of techniques is available for assessing such differences. Common between-group methods include the Chow test, matching methods, difference-in-differences, $\chi^2$ tests, t-tests, and Wilks' lambda. Sometimes within-group properties are also of interest, for example when we want to compare the variability within groups across groups. For instance, ANOVA can be used to separate the within-group and between-group variability of grouped data.
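The within/between split that ANOVA relies on can be sketched in a few lines. This is a minimal illustration using only the Python standard library. The function name `anova_decomposition` and the data are made up for the example, and it assumes a one-way layout with independent groups.

```python
from statistics import fmean


def anova_decomposition(groups):
    """Split the total sum of squares into between- and within-group parts.

    `groups` is a list of lists of numbers (hypothetical one-way layout).
    Returns (ss_between, ss_within, f_stat).
    """
    all_values = [x for g in groups for x in g]
    grand_mean = fmean(all_values)

    # Between-group part: how far each group mean sits from the grand mean,
    # weighted by group size.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)

    # Within-group part: spread of observations around their own group mean.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f_stat


# Made-up illustrative data: three groups of three observations each.
groups = [[4.0, 5.0, 6.0], [7.0, 8.0, 9.0], [1.0, 2.0, 3.0]]
ss_between, ss_within, f_stat = anova_decomposition(groups)
print(ss_between, ss_within, f_stat)
```

The two parts add up to the total sum of squares around the grand mean, and the F statistic is the ratio of the two mean squares. A p-value would additionally require the F distribution, which is left out of this sketch.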

With observational data, a common concern is that unobservable factors may have led individuals to self-select into certain groups. Group differences are therefore mostly descriptive unless the data come from a randomized experiment or methods for causal inference, such as matching or difference-in-differences, are applied.
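As a rough illustration of the difference-in-differences idea, the simplest two-group, two-period version compares the change in the treated group with the change in the control group. The function name `diff_in_diff` and all numbers below are hypothetical. The estimate has a causal reading only under the usual parallel-trends assumption, which this sketch does not check.

```python
from statistics import fmean


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period difference-in-differences on group means."""
    treated_change = fmean(treated_post) - fmean(treated_pre)
    control_change = fmean(control_post) - fmean(control_pre)
    # The control group's change stands in for what would have happened
    # to the treated group without treatment (parallel-trends assumption).
    return treated_change - control_change


# Made-up illustrative outcomes.
estimate = diff_in_diff(
    treated_pre=[10.0, 12.0],
    treated_post=[15.0, 17.0],
    control_pre=[9.0, 11.0],
    control_post=[11.0, 13.0],
)
print(estimate)
```

In practice the same quantity is usually estimated as the interaction coefficient in a regression, which also yields standard errors.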

578 questions
26
votes
3 answers

Logistic regression or T test?

A group of persons answers one question. The answer can be "yes" or "no". The researcher wants to know whether age is associated with the type of answer. The association was assessed by doing a logistic regression where age is the explanatory…
Gwen
  • 533
  • 3
  • 6
  • 10
21
votes
4 answers

Comparison of ranked lists

Suppose that two groups, comprising $n_1$ and $n_2$ members, each rank a set of 25 items from most to least important. What are the best ways to compare these rankings? Clearly, it is possible to do 25 Mann-Whitney U tests, but this would result in 25 test…
Peter Flom
  • 94,055
  • 35
  • 143
  • 276
21
votes
5 answers

What is effect size... and why is it even useful?

I have an introductory-graduate-level statistics background (assume I know mathematical statistics and probability at an undergraduate level (e.g., Wackerly et al., Ross' Probability), and have some knowledge of measure theory). I have recently…
Clarinetist
  • 3,761
  • 3
  • 25
  • 70
16
votes
2 answers

Unequal sample sizes: When to call it quits

I'm peer reviewing an academic journal article and the authors wrote the following as justification for not reporting any inferential statistics (I deidentified the nature of the two groups): In total, 25 of the 2,349 (1.1%) respondents reported…
15
votes
7 answers

Why the p-value of t.test() is not statistically significant when mean values look really different

I am trying to find out whether there is a significant difference in the mean value of a biomarker between two groups. I am using t.test in R. The mean (SD) values are 1142 (1079) and 864 (922) in the groups. But the p-value of the test shows the…
12
votes
3 answers

How to test whether subgroup mean differs from overall group that includes the subgroup?

How can I test whether the mean (e.g., blood pressure) of a subgroup (e.g., those who died) differs from the whole group (e.g., everyone who had the disease including those that died)? Clearly, the first one is a subgroup of the second one. What…
user1061210
  • 1,005
  • 3
  • 13
  • 19
11
votes
3 answers

Welch test seems to perform much worse than equal variance t-test

The SciPy function in Python, ttest_ind() by default works with the $t$-test that assumes equal variances. There is a parameter, equal_var = False that switches it to the Welch test where equal variances in the two samples are not assumed. This…
11
votes
0 answers

10 % false positives from nonlinear mixed effect models : Why?

I've run a simulation study in order to estimate type I error rate of the test of group effect in a nonlinear mixed effects model, using nlmer from lme4 package. The results show there is 8-10 % false-positives. I wonder if something went wrong in…
10
votes
2 answers

In a meta-analysis, how should one handle non-significant studies containing no raw data?

Let's say that I'm conducting a meta-analysis, looking at the performance of group A and group B with respect to a certain construct. Now, some of the studies that I'll come across will report that no statistical differences could be found between…
10
votes
2 answers

Compare the statistical significance of the difference between two polynomial regressions in R

So first of all I did some research on this forum, and I know extremely similar questions have been asked, but they usually haven't been answered properly, or sometimes the answers are simply not detailed enough for me to understand. So this time my…
9
votes
2 answers

Investigating differences between populations

Say we have a sample from two populations: A and B. Let's assume these populations are made of individuals and we choose to describe individuals in terms of features. Some of these features are categorical (e.g. do they drive to work?) and some are…
Amelio Vazquez-Reina
  • 17,546
  • 26
  • 74
  • 110
8
votes
0 answers

Two-sample bootstrap?

I have two independent samples of observations. From each sample I produce a statistic. Let's denote these as $\theta_1$ and $\theta_2$. I'd like to test the hypothesis that $H_0: \Theta_1=\Theta_2$, but I have these two constraints: There is no…
Trisoloriansunscreen
  • 1,669
  • 12
  • 25
8
votes
3 answers

How to compare ranked data?

I have some questions about how to analyze ranked data. The data looks like this: 4 groups of people with HIV and 16 other groups of people living in the same village were asked to rank 12 challenges for people with HIV according to importance.…
jacky
  • 81
  • 1
  • 2
8
votes
1 answer

Comparing distributions of unequal sample sizes

Consider the distribution shown in the histogram below: I have computed a Welch's t-test for a difference in means between these two groups, as well as a Kruskal-Wallis test to see whether these two groups come from the same distribution. Both…
blacksite
  • 614
  • 1
  • 10
  • 22
8
votes
2 answers

Cook's distance in detecting outliers

According to my understanding, Cook's distance measures the influence of each observation by excluding points when fitting a model. So I assume it could be a reasonable approach for outlier detection? My questions: assume data are categorized into…
Roy C
  • 103
  • 1
  • 5