A genome-wide association study (GWA study), also known as whole genome association study (WGA study or WGAS), is an examination of many common genetic variants in different individuals to see whether any variant is associated with a trait. [Wikipedia]
Questions tagged [gwas]
31 questions
25
votes
1 answer
In genome-wide association studies, what are principal components?
In genome-wide association studies (GWAS):
What are the principal components?
Why are they used?
How are they calculated?
Can a genome-wide association study be done without using PCA?

suprvisr
12
votes
1 answer
How do children manage to pull their parents together in a PCA projection of a GWAS data set?
Take 20 random points in a 10,000-dimensional space with each coordinate iid from $\mathcal N(0,1)$. Split them into 10 pairs ("couples") and add the average of each pair ("a child") to the dataset. Then do PCA on the resulting 30 points and plot…

amoeba
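The setup in this excerpt can be reproduced with a short simulation. The sketch below is only an illustration under stated assumptions: it uses NumPy, a fixed seed, and SVD-based PCA, and the helper names `simulate_families` and `pca_scores` are hypothetical, not taken from the question.

```python
import numpy as np


def simulate_families(n_couples=10, n_dims=10_000, seed=0):
    """Return a (3 * n_couples, n_dims) array: parents first, then children."""
    rng = np.random.default_rng(seed)
    # Each coordinate of each parent is drawn iid from N(0, 1).
    parents = rng.standard_normal((2 * n_couples, n_dims))
    # Child i is the coordinate-wise average of parents 2i and 2i + 1.
    children = (parents[0::2] + parents[1::2]) / 2
    return np.vstack([parents, children])


def pca_scores(data, n_components=2):
    """Project the rows of `data` onto the leading principal components."""
    centered = data - data.mean(axis=0)
    # Thin SVD: the PC scores are the left singular vectors scaled by the
    # singular values.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


data = simulate_families()
scores = pca_scores(data)

parent_scores, child_scores = scores[:20], scores[20:]
# Midpoint of each couple in the PC1/PC2 plane.
midpoints = (parent_scores[0::2] + parent_scores[1::2]) / 2
```

Because the projection is linear once the data are centered, each child lands exactly at the midpoint of its parents in any PC plane. Whether the parents themselves end up close together is the question being asked, and plotting `scores` is one way to look at it.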
7
votes
1 answer
Case-control study and Logistic regression
Suppose we have case-control data, where cases have some disease ($Y$) and controls don't and we are interested in the association of some other variable(s) ($X$). I know that in this scenario we cannot use the disease as the response variable…

bdeonovic
4
votes
1 answer
Inflated p-value after adjusting for covariates, in GWAS
I am working on some GWAS (genome-wide association studies) now. A genome scan was done for all the SNPs, adjusting for the first 3 principal components (PCs are used to adjust for the ethnicity effect), and the QQ plot looks fine (most p-values lie on the…

VincentShen
4
votes
1 answer
Deflated QQ plots in genome-wide association studies
I am working on a GWAS dataset containing 920 individuals with genotype information on ~1.5M SNPs (genotyped on Illumina 2.5omni chip; no imputed SNPs). I am testing several different phenotypes in this dataset, but every time I run an association…

motu
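QQ-plot inflation or deflation is often summarized with the genomic inflation factor, the median observed chi-square statistic divided by its expected median under the null. The sketch below is an assumption-laden illustration, not part of the question: it treats every p-value as coming from a two-sided 1-df test, uses only the Python standard library, and the function name `genomic_inflation` and both example p-value lists are hypothetical.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom (about 0.4549).
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def genomic_inflation(p_values):
    """Genomic inflation factor from two-sided p-values in (0, 1]."""
    norm = NormalDist()
    # Convert each p-value back to a 1-df chi-square statistic (a squared z).
    # The lower-tail quantile of p / 2 is used for numerical stability.
    chi2 = [norm.inv_cdf(p / 2) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


# Hypothetical p-values spread evenly over (0, 1): the null expectation.
null_p = [(i + 0.5) / 1000 for i in range(1000)]
lam_null = genomic_inflation(null_p)

# Hypothetical p-values pushed toward 1, mimicking a deflated QQ plot.
deflated_p = [p ** 0.5 for p in null_p]
lam_deflated = genomic_inflation(deflated_p)
```

A value near 1 matches the null expectation, values above 1 indicate inflation, and values below 1 indicate deflation of the kind described in the excerpt.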
4
votes
1 answer
Including covariates makes QQ plot worse
I know this is similar to this question, but that one didn't seem to get a satisfactory answer.
I'm using plink to run a GWAS. My phenotype data are binary, so it's performing a logistic regression for each genetic variant. I'm checking the data by…

njc
3
votes
1 answer
Principal components as covariates in a linear model
I'm working with some genetics data, performing linear regressions, and have been advised to control for population structure by performing principal components analysis. My model at the moment is of the structure:
Minor Allele Frequency ~ Age +…

Erika_Hammerl246
3
votes
0 answers
Fisher's method for P-value more conservative than OLS?
I am currently observing an interesting phenomenon in my analysis. I have a simple logistic regression model for independent Inds. The model is as follows:
$$\operatorname{logit}(Y) = \beta_0+\beta_1X $$
My data $X$ can be stratified by gender (2…

Adam
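Assuming "Fisher's method" in this title refers to Fisher's combined probability test for pooling per-stratum p-values, a minimal sketch looks like the following. The function name `fisher_combined_p` and the example p-values are hypothetical. The closed-form survival function works here because the statistic has an even number of degrees of freedom.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method."""
    k = len(p_values)
    # Under the null, -2 * sum(log p_i) follows a chi-square with 2k df.
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    half = statistic / 2.0
    # Survival function of a chi-square with 2k df (even df):
    # exp(-x / 2) * sum_{i < k} (x / 2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Two hypothetical stratum-level p-values.
combined = fisher_combined_p([0.05, 0.05])
```

For two strata the result reduces to `p1 * p2 * (1 - log(p1 * p2))`, which gives a quick hand check on the output.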
2
votes
0 answers
How to estimate the phenotypic variation explained by top SNPs from a GWAS study?
I have conducted a large-scale GWAS and got a few significantly associated SNPs. I used GEMMA with the -lmm 1 option to run the GWAS and obtain the beta and standard-error estimates. I want to estimate the percent phenotypic variation explained…

Anik Dutta
2
votes
0 answers
Multivariate Cox regression - combining genotypes for each SNP
For 53 SNPs I have coded genotypes as 0 for aa, 1 for ab, and 2 for bb for 29 samples, and I have outcome and time to outcome. Here outcome is called "BCR" below.
I am using the survival package in R to run a Cox regression analysis on 7/53…

user3324491
2
votes
0 answers
How to analyse GWAS data of co-morbid disorders?
Let's say I am running a GWAS for a disease condition O, where all my cases have one or more of the following diseases: A, B, and C, in addition to my outcome of interest O. So I include individuals with condition A, B and C in my control group as…

Veera
2
votes
0 answers
Meta-analysis of GWAS and interpretation of estimated p-values
I'm a bit confused about a meta-analysis performed with PLINK on two studies.
The statistics look like this for a specific SNP in each study:
Study_1: OR = 3.657, SE = 0.336, p = 0.0001137
Study_2: OR = 9.08E-009, SE = 8363.831118, p = 0.998
Using…

eXpander
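The two study-level results quoted in this excerpt can be combined by hand with a fixed-effect, inverse-variance meta-analysis. The sketch below assumes the reported SE is the standard error of log(OR), which is consistent with the p-value given for Study_1. It is a generic illustration with a hypothetical function name, `fixed_effect_meta`, and is not a reproduction of PLINK's own output.

```python
import math


def fixed_effect_meta(odds_ratios, standard_errors):
    """Inverse-variance fixed-effect meta-analysis on the log(OR) scale."""
    betas = [math.log(odds_ratio) for odds_ratio in odds_ratios]
    # Each study is weighted by the inverse of its variance.
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return math.exp(pooled_beta), pooled_se, p_value


# Values quoted in the excerpt above (SE assumed to be on the log scale).
pooled_or, pooled_se, pooled_p = fixed_effect_meta(
    [3.657, 9.08e-9], [0.336, 8363.831118]
)
```

Under this weighting, Study_2 gets a weight of roughly 1.4e-8 against about 8.9 for Study_1, so the pooled estimate is essentially Study_1 alone. An SE that large usually points to a model that did not converge for that SNP rather than to real evidence of no effect.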
1
vote
0 answers
Is there a way to adjust FDR threshold using correlation across tests (aka, number of estimated independent tests, like in Bonferroni correction)
For Bonferroni correction, correlations can be used to inform correction of multiple testing using the spectral decomposition of matrices. For example, in GWAS, the threshold is established to be 5e-8 after this kind of correction (account to the…

Albert Ying
1
vote
1 answer
Use of Bayesian Regression and the Multiple Testing problem
In many association studies (e.g., GWAS), a large number of Linear Regression models are fitted. Then, a strategy to account for the Multiple Testing issue is adopted (e.g., Bonferroni). That being said, it is clear now that it's not that easy, and…

wrong_path
1
vote
0 answers
Expected effect of each single-nucleotide polymorphism (SNP) in a genome-wide association study (GWAS)
It was mentioned in a genetics class that in a genetic association analysis of a trait with all SNPs, it is possible to compute the expected effect of each SNP with the trait using the correlation structure between the SNPs and the size of the…

Richard