I have what feels like a simple question, but I was unable to find an answer easily.
The situation
Let's say I have a gene microarray dataset with tens of thousands of genes and a small number of samples (<100). I am interested in simple mean differences between two sample groups. I run a t-test for each gene and get p-values, but none of them survive Bonferroni correction for multiple testing.
However, I also see that 8% of the genes are significant at the uncorrected level, which I think is above chance. So instead I would like to claim that there are more significant genes than expected.
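To make the setup concrete, here is a minimal sketch in Python on simulated data. The sizes, the variable names (`expr`, `labels`) and the 0.05 level are placeholders I made up for illustration, not my real dataset, and I am assuming the ordinary two-sample t-test from SciPy.

```python
import numpy as np
from scipy import stats

# Hypothetical sizes: far fewer genes than a real array, to keep the sketch fast.
n_genes, n_a, n_b = 2000, 20, 20
rng = np.random.default_rng(0)

# Simulated expression matrix (genes x samples) with no true group differences.
expr = rng.normal(size=(n_genes, n_a + n_b))
labels = np.array([0] * n_a + [1] * n_b)

# One two-sample t-test per gene (row), comparing the two sample groups.
_, p = stats.ttest_ind(expr[:, labels == 0], expr[:, labels == 1], axis=1)

# Bonferroni: a gene survives only if its p-value is below alpha / number of tests.
alpha = 0.05
bonferroni_threshold = alpha / n_genes
bonferroni_hits = int(np.sum(p < bonferroni_threshold))

print("Bonferroni threshold:", bonferroni_threshold)
print("genes surviving Bonferroni:", bonferroni_hits)
```

With tens of thousands of genes the threshold becomes very small, which is why nothing survives in my case.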
The problem
It feels like I cannot simply state that I expect 5% and observe 8%, so I have more, because the genes are most likely not independent. With correlated genes, getting 8% or more by chance may not be that unlikely.
So instead I tried permuting the sample labels and checking what fraction of permutations gives me 8% or more genes with significant differences. If only 1 percent of permutations give 8% or more significant genes, I would state that there are more significant genes than expected, with a permutation p-value of 0.01.
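The permutation procedure I have in mind can be sketched roughly as follows. This is only an illustration on simulated data: the function names (`frac_significant`, `permutation_pvalue`), the sizes and the number of permutations are hypothetical. One deviation from my description above is that the sketch adds 1 to the numerator and denominator so that the permutation p-value can never be exactly zero; that correction is an assumption on my part.

```python
import numpy as np
from scipy import stats

def frac_significant(expr, labels, alpha=0.05):
    """Fraction of genes (rows) with an uncorrected t-test p-value below alpha."""
    _, p = stats.ttest_ind(expr[:, labels == 0], expr[:, labels == 1], axis=1)
    return float(np.mean(p < alpha))

def permutation_pvalue(expr, labels, n_perm=200, alpha=0.05, seed=0):
    """Compare the observed fraction of significant genes with its
    distribution under random relabelling of the samples."""
    rng = np.random.default_rng(seed)
    observed = frac_significant(expr, labels, alpha)
    null = np.empty(n_perm)
    for i in range(n_perm):
        # Shuffle whole-sample labels only: every gene sees the same
        # relabelling, so gene-gene correlations are left intact.
        null[i] = frac_significant(expr, rng.permutation(labels), alpha)
    # Share of permutations at least as extreme as the observed fraction,
    # with the add-one correction mentioned above.
    p_perm = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, float(p_perm), null

# Simulated example: 500 genes, 10 + 10 samples, no true group differences.
rng = np.random.default_rng(1)
expr = rng.normal(size=(500, 20))
labels = np.array([0] * 10 + [1] * 10)

observed, p_perm, null = permutation_pvalue(expr, labels)
print("observed fraction significant:", observed)
print("permutation p-value:", p_perm)
```

The test statistic here is the single number "fraction of genes with p < 0.05", so the result is one global p-value and says nothing about which individual genes differ.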
The questions
- Is this a valid approach?
- Are there better alternatives?
- Does anybody know of any literature related to this problem?