Inflated p-value after adjusting for covariates, in GWAS

Question

I am working on a GWAS (genome-wide association study). I ran a genome scan over all the SNPs, adjusting for the first 3 principal components (the PCs are used to adjust for ethnicity effects), and the QQ plot looks fine: most p-values lie on the diagonal line, with a few signals off the line.

However, when I adjusted for clinical covariates (treatment, clinical stage) and reran the genome scan, the QQ plot looked weird. Usually, when you adjust for covariates, you expect the p-values to lie under the diagonal line (or to see fewer hits than in the unadjusted model). In my model, after adjusting for clinical covariates, the QQ plot is way off the diagonal line and almost all the SNPs appear significant, so most of them must be false positives. I don't know how to explain this. One important thing is that some clinical covariates are extremely significant in the clinical model (the model with no SNP information), with p-values as small as 10E-11. I have 250 samples and 750K SNPs in my dataset.
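For readers who want to put a number on "way off the diagonal": the usual summary is the genomic inflation factor lambda, the median observed 1-d.f. chi-square statistic divided by its expected median under the null (about 0.455). Below is a minimal Python sketch, assuming the p-values come from 1-d.f. tests. The function names are illustrative and are not part of GenABEL.

```python
import math
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Median of a chi-square with 1 d.f. is (Phi^-1(0.75))^2, roughly 0.4549.
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def p_to_chi2_1df(p):
    """Convert a 1-d.f. test p-value (0 < p <= 1) to its chi-square statistic."""
    # P(chi2_1 > x) = 2 * Phi(-sqrt(x)), so x = (Phi^-1(p / 2))^2.
    return _STD_NORMAL.inv_cdf(p / 2.0) ** 2


def inflation_factor(pvalues):
    """Genomic inflation factor: observed median chi-square / null median."""
    return median(p_to_chi2_1df(p) for p in pvalues) / CHI2_1DF_MEDIAN


def qq_points(pvalues):
    """Return (expected, observed) -log10(p) pairs for a QQ plot."""
    n = len(pvalues)
    ordered = sorted(pvalues)  # smallest p-value first
    return [
        (-math.log10((i + 0.5) / n), -math.log10(p))
        for i, p in enumerate(ordered)
    ]
```

A lambda close to 1 corresponds to a QQ plot that follows the diagonal. A value well above 1 means the bulk of the test statistics is inflated, not just a handful of top hits.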

Did you make your QQ plot of the test statistic/p-values associated with the SNP or with one of the covariates? — bdeonovic, Aug 21 '13 at 18:16
The QQ plot is made from the p-values associated with the SNPs, not the covariates. — VincentShen, Aug 21 '13 at 19:56
Perhaps the test you are performing is a significance test of all the variables in the model (SNP + covariates)? Make sure the test only checks whether the coefficient on the SNP variable is 0. — bdeonovic, Aug 21 '13 at 23:02
I am using GenABEL to run the GWAS. When I generate the QQ plot with "Pc1df" instead of "P1df", the plot looks fine. So is "Pc1df" the p-value I should use for the QQ plot? — VincentShen, Aug 22 '13 at 20:31
I suppose you have found a solution since August. Can you tell us what the problem was? (You can write an answer and accept it.) — Elvis, Nov 28 '13 at 19:51

score 1 · Answer 1 · answered Aug 30 '13 at 17:19

I don't know which p-values are appropriate in your case, since I do not know what your data are like. For the Pc1df p-values, I do not know what "corrected for possible inflation" means (population stratification, perhaps?). The GenABEL documentation describes the three p-value lists as follows:

P1df: corresponding list of P-values of 1-d.f. (additive or allelic) test for association between SNP and trait

P2df: corresponding list of P-values of 2-d.f. (genotypic) test for association between SNP and trait

Pc1df: P-values from the 1-d.f. test for association between SNP and trait; the statistic is corrected for possible inflation

source: http://www.genabel.org/GenABEL/scan.gwaa-class.html
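If "corrected for possible inflation" refers to genomic control (an assumption on my part; the page quoted above does not spell it out), the correction would look roughly like the sketch below: estimate lambda from the median 1-d.f. chi-square statistic, divide every statistic by lambda, and convert back to p-values. This is a plain-Python illustration with a hypothetical function name, not GenABEL's actual code.

```python
import math
from statistics import NormalDist, median


def genomic_control(pvalues):
    """Return (lambda, corrected p-values) for 1-d.f. test p-values in (0, 1]."""
    nd = NormalDist()
    # p -> chi-square (1 d.f.): x = (Phi^-1(p / 2))^2
    chi2 = [nd.inv_cdf(p / 2.0) ** 2 for p in pvalues]
    null_median = nd.inv_cdf(0.75) ** 2  # roughly 0.4549
    lam = median(chi2) / null_median
    # Common convention (assumed here): never inflate statistics if lambda < 1.
    lam = max(lam, 1.0)
    # chi-square (1 d.f.) -> p: P(chi2_1 > x) = erfc(sqrt(x / 2))
    corrected = [math.erfc(math.sqrt(x / lam / 2.0)) for x in chi2]
    return lam, corrected
```

Note that a correction of this kind rescales every statistic by the same factor, so it will always pull the QQ plot back toward the diagonal. A good-looking plot after correction therefore does not by itself explain why the uncorrected statistics were inflated once the clinical covariates were added.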
