I have conducted a large-scale GWAS and found a few significantly associated SNPs. I ran the GWAS in GEMMA with the -lmm 1 option to obtain the beta and standard-error estimates. I now want to estimate the percentage of phenotypic variation explained by each significant SNP. I used the following procedure in R to estimate the variance explained:

fit <- lm(Phenotypic_value ~ SNP_data, data = a)
summary(fit)$r.squared

Here, the data frame a contains three columns: sample_ID, Phenotypic_value for each sample, and the biallelic SNP_data. This gives an R-squared of 0.43, i.e. 43% of the phenotypic variation explained by the SNP.

I also tried a second formula: 2*f*(1-f)*b.alt^2. Here, f is the minor allele frequency and b.alt is the effect size, i.e. the beta estimate obtained from GEMMA. This gives a value of 0.03, i.e. 3% of the variation explained, which seems more reasonable to me.
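
To make the second calculation concrete, here is a minimal sketch of how I applied the formula. The values of f and b.alt are made up for illustration and are not my actual estimates. var_y is a name I introduce here on the assumption that the phenotype has unit variance; as far as I understand, for an unstandardized phenotype the result should be divided by the phenotypic variance.

```r
# Hypothetical values for illustration only (not my actual results)
f     <- 0.20   # minor allele frequency
b.alt <- 0.30   # beta estimate for the SNP, as reported by GEMMA
var_y <- 1.0    # assumed phenotypic variance (1 if the phenotype is standardized)

# Variance explained by a biallelic SNP under Hardy-Weinberg equilibrium,
# where the genotype variance is 2 * f * (1 - f)
pve <- 2 * f * (1 - f) * b.alt^2 / var_y
pve
```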

My question is: which of these two methods is correct? Or is there another way to estimate the percentage of variation explained?

Alternatively, I found this formula on the GEMMA Google group: pve <- var(x) * (beta^2 + se^2)/var(y). However, I do not understand how to obtain the values of var(x) and var(y).
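
One possible reading, which is my own assumption and not something I have seen confirmed for GEMMA, is that x is the vector of genotype codes (0/1/2) for the SNP and y is the vector of phenotype values, so both variances could be taken directly from the data. A sketch of that reading with simulated data (the data frame contents, beta, and se below are all hypothetical):

```r
set.seed(1)

# Simulated stand-in for my data frame `a` (hypothetical values)
n <- 500
a <- data.frame(SNP_data = rbinom(n, size = 2, prob = 0.2))  # genotypes coded 0/1/2
a$Phenotypic_value <- 0.3 * a$SNP_data + rnorm(n)

# Hypothetical beta and standard error, standing in for the GEMMA output
beta <- 0.3
se   <- 0.05

# Assumed interpretation: x = genotype codes, y = phenotype values
var_x <- var(a$SNP_data, na.rm = TRUE)
var_y <- var(a$Phenotypic_value, na.rm = TRUE)

pve <- var_x * (beta^2 + se^2) / var_y
pve
```

If this reading is right, var(x) should be close to 2*f*(1-f) under Hardy-Weinberg equilibrium, which would tie this formula to the second one above.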

It would be great to receive some feedback on this. Thank you.

Anik Dutta