Questions tagged [two-sample]

The two-sample problem is: given samples X and Y from two distributions, test whether the two underlying distributions are the same. One of the most common classical nonparametric approaches is the Kolmogorov-Smirnov test.

The two-sample problem is to compare parameters of two (potentially) different populations given samples from them.

An extreme case concerns testing whether the populations have the same distribution. A classical solution is the Kolmogorov-Smirnov test.
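As a minimal sketch of the Kolmogorov-Smirnov approach mentioned above, assuming NumPy and SciPy are available: the two-sample statistic is the largest absolute gap between the two empirical CDFs. The helper name `ks_statistic` and the simulated samples are illustrative assumptions, not part of any question on this page.

```python
import numpy as np
from scipy import stats


def ks_statistic(x, y):
    """Largest absolute gap between the two empirical CDFs (hypothetical helper)."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    # The gap can only change at observed points, so evaluate both ECDFs there.
    grid = np.concatenate([x, y])
    ecdf_x = np.searchsorted(x, grid, side="right") / x.size
    ecdf_y = np.searchsorted(y, grid, side="right") / y.size
    return float(np.max(np.abs(ecdf_x - ecdf_y)))


rng = np.random.default_rng(0)  # fixed seed; simulated data for illustration only
x = rng.normal(loc=0.0, scale=1.0, size=200)
y = rng.normal(loc=0.5, scale=1.0, size=200)

# scipy.stats.ks_2samp returns the same statistic together with a p-value.
result = stats.ks_2samp(x, y)
print(ks_statistic(x, y), result.statistic, result.pvalue)
```

A small p-value is evidence against equality of the two distributions; a large one does not by itself establish that they are the same.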

88 questions
21
votes
2 answers

2 Sample Kolmogorov-Smirnov vs. Anderson-Darling vs Cramer-von-Mises

I was wondering what the criteria are for choosing among Kolmogorov-Smirnov, Cramer-von-Mises, and Anderson-Darling when comparing 2 ECDFs. I know the mathematics of how they differ, but if I have some ECDF data, how would I know which test is appropriate to…
6
votes
1 answer

Earth Mover's Distance and Maximum Mean Discrepancy

By Kantorovich-Rubinstein duality, the Earth Mover's Distance (EMD)/Wasserstein metric is equivalent to Maximum Mean Discrepancy (MMD), correct? See here for a more thorough explanation. Why then does the original Kernel MMD paper compare their…
6
votes
3 answers

A Kernel Two Sample Test and Curse of Dimensionality

Gretton et al. describe the Kernel Maximum Mean Discrepancy, a measure of distance between distributions. In order to compare two distributions, it turns out you can do much better than, say, taking the L2 distance between density estimates. As…
5
votes
1 answer

Why am I observing non-uniformly distributed (negatively skewed) p-values for two-sample tests of mixture distributions when the null is true?

I am interested in generating Gaussian mixture distributions as the null distributions for a series of two-sample test simulations. It is a well established fact that p-values follow a uniform distribution when the null hypothesis is true, and…
5
votes
2 answers

What is the difference between a two-sample t-test and a paired t-test?

While I was glancing at hypothesis tests, I saw the paired and two-sample t-tests but couldn't understand the difference. For the explanation of these two tests, I saw the following sentence: "Two-sample t-test is used when the data of two samples are…
4
votes
2 answers

confidence interval for 2-sample t test with scipy

    from scipy import stats
    import numpy as np
    ts1 = np.array([11,9,10,11,10,12,9,11,12,9])
    ts2 = np.array([11,13,10,13,12,9,11,12,12,11])
    r = stats.ttest_ind(ts1, ts2, equal_var=False)
    print(r.statistic, r.pvalue)

The null hypothesis is that the…
Florin Andrei
  • 163
  • 1
  • 6
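One possible way to get the interval asked about in the question above, as a sketch only: build the Welch confidence interval for the difference in means by hand from the Welch-Satterthwaite degrees of freedom. The helper name `welch_ci` and the 95% level are assumptions made for illustration; the data are the two arrays from the question excerpt.

```python
import numpy as np
from scipy import stats

ts1 = np.array([11, 9, 10, 11, 10, 12, 9, 11, 12, 9])
ts2 = np.array([11, 13, 10, 13, 12, 9, 11, 12, 12, 11])


def welch_ci(a, b, confidence=0.95):
    """CI for mean(a) - mean(b) without assuming equal variances (hypothetical helper)."""
    va = a.var(ddof=1) / a.size  # squared standard error of mean(a)
    vb = b.var(ddof=1) / b.size  # squared standard error of mean(b)
    se = np.sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (a.size - 1) + vb ** 2 / (b.size - 1))
    t_crit = stats.t.ppf(0.5 + confidence / 2, df)
    diff = a.mean() - b.mean()
    return diff - t_crit * se, diff + t_crit * se


r = stats.ttest_ind(ts1, ts2, equal_var=False)
low, high = welch_ci(ts1, ts2)
print(r.statistic, r.pvalue, (low, high))
```

Because the interval and the Welch test use the same standard error and degrees of freedom, the 95% interval excludes zero exactly when the two-sided p-value is below 0.05.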
4
votes
1 answer

Can Friedman's test be used with two samples?

Friedman's test is commonly referred to by the full name "The Friedman's test for three or more correlated samples". The question is, would the results be valid if I applied Friedman's test to two correlated samples? Or is…
4
votes
1 answer

Method to justify claim that two samples come from the same distribution

I know of ways to test "whether" two data sets come from the same distribution, in the sense that I can treat the hypothesis that they are from the same distribution as the null hypothesis. However, I want evidence that the hypothesis of sameness…
Mars
  • 888
  • 2
  • 10
  • 20
3
votes
3 answers

T-test states difference of donation is significant when Z-test claims not, what method to use?

I have two populations who have been exposed to two different websites that should bring them to donations: one with a progress bar that pushes them to give (B, segment 2) and the other not (A, segment 1). And with log(y): I have noticed that, on…
3
votes
1 answer

Why is the two-sample test giving me inconsistent results?

I am applying a two-sample t-test to determine whether we have software regressions on latency measurements. Procedure Run the test for build b1 and gather 60 latency measurements. Run the test for build b2 and gather 60 latency…
Klik
  • 157
  • 4
3
votes
2 answers

Testing for equal proportions when sample sizes are very small

Suppose I observe binary data for two samples (hopefully the notation below is obvious) and I wish to test the hypotheses: $$H_0: p_1 = p_2$$ $$H_A: p_1 \neq p_2$$ I know there is a $z$-test for doing this but such tests are asymptotic tests. What…
3
votes
0 answers

R function for weighted two-sample t-test **with Welch-adjusted t statistic**?

I'm conducting a hypothesis test for the difference between two groups. The file dat1 contains all observations of measure for the first group; dat2 contains all observations of measure for the second group. Measures of both groups are normally…
3
votes
3 answers

Two one-sided hypothesis tests instead of a two-sided test?

In hypothesis testing, the guidance is to use a one-sided test (alternative "greater" or "lesser") if we don't care about errors in one of the directions. If we do care about errors in both directions and are very good at running one-sided tests,…
ryu576
  • 2,220
  • 1
  • 16
  • 25
3
votes
0 answers

Two-sample KS test null distribution, reference request

I follow, more or less, the derivation of the KS test statistic's distribution that is given on Wikipedia. The following section on the two-sample test also makes sense if all I want to do is reject the null hypothesis of distribution equality.…
Dave
  • 28,473
  • 4
  • 52
  • 104
3
votes
1 answer

Robust two-sample test with triplicate measurements?

When testing for a difference in mean between two conditions, biologists typically use a $t$-test, and wring their hands endlessly about how to justify removing outliers. Whereas I typically use a Mann-Whitney U test, which is robust to the presence…
user54038
  • 493
  • 3
  • 9