Please note that I am interested specifically in Bernoulli samples and hope to find criteria specific to the Bernoulli distribution, rather than Student's t-statistic, the Mann-Whitney test, etc., since those rely on asymptotic assumptions that I would prefer to avoid. (For large sample sizes it is fine to use asymptotic normality, but not for small sample sizes, which may occur in my situation.)
Question: Consider three Bernoulli samples (0,1,...,1), (1,1,...,0), (0,0,...,1), possibly of different lengths. I want to determine whether some of them come from the same distribution, or none of them do.
I would expect some kind of "p-value" calculation as an answer to the question. My problem is that I do not see how to obtain it for the Bernoulli distribution without relying on some asymptotic normality.
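For concreteness, here is a minimal sketch of the kind of exact calculation I am hoping for, written for a pairwise comparison using Fisher's exact test on the 2x2 table of counts (the sample values and the use of scipy.stats.fisher_exact are my own illustration, not something I am claiming is the right answer). I am not sure whether this is the principled tool here, nor how to extend it to three samples of different lengths.

```python
# Sketch only: pairwise exact comparison of two Bernoulli samples.
# The data below are hypothetical; fisher_exact conditions on the table
# margins and does not rely on asymptotic normality.
from scipy.stats import fisher_exact

sample_a = [0, 1, 1, 1, 0, 1]   # hypothetical sample
sample_b = [1, 1, 0, 0, 0]      # hypothetical sample, different length

# 2x2 contingency table: rows = samples, columns = (#ones, #zeros)
table = [
    [sum(sample_a), len(sample_a) - sum(sample_a)],
    [sum(sample_b), len(sample_b) - sum(sample_b)],
]

_, p_value = fisher_exact(table, alternative="two-sided")
print(f"Fisher exact p-value (sample a vs sample b): {p_value:.4f}")
```

If I understand correctly, the extension of this idea to a 2x3 table is the Freeman-Halton test, but I do not know whether that, or pairwise comparisons with some correction, is the appropriate route; that is part of what I am asking.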
Motivation
The question is typical of machine learning binary classification tasks at the feature-preprocessing stage, where one way to preprocess categorical features is to merge groups with a small number of observations into larger groups, thus obtaining more stable estimators (a rough sketch of such merging is given below).
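As an illustration of that merging step (the column names and the count threshold below are hypothetical, purely for illustration):

```python
# Sketch of the rare-level merging step described above.
# Column names and the threshold are hypothetical choices.
import pandas as pd

df = pd.DataFrame({
    "category": ["a", "a", "b", "c", "c", "c", "d"],
    "target":   [0,   1,   1,   0,   0,   1,   1],   # Bernoulli target
})

min_count = 3  # hypothetical minimum group size
counts = df["category"].value_counts()
rare_levels = counts[counts < min_count].index

# Collapse all rare levels into a single "other" group
df["category_merged"] = df["category"].where(
    ~df["category"].isin(rare_levels), "other"
)
print(df["category_merged"].value_counts())
```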
PS
Difference from an already asked question:
There is a nice discussion in a related question: Principled way of collapsing categorical variables with many levels?
However, the difference is that I am interested in the specific situation of Bernoulli samples (in machine learning language, a binary classification problem), while the latter question and its answers deal with the general situation of a generic/unknown distribution of the target variable. One cannot hope for an analytical answer in general, whereas for the specific case of the Bernoulli distribution it sounds like a classical question which, most probably, has been addressed in the literature.