Why would one use age-squared as a covariate in a genetic association study?

Question

Why would one use age and age-squared as covariates in a genetic association study? I can understand the use of age if it has been identified as a significant covariate, but I am at a loss as to the use of age-squared.

Are you looking for a domain-specific answer, or a general answer of why this kind of thing is done in a linear model? Non-domain-wise, I believe it's common to have age and age-squared in survival-type studies to model the relatively linear failure rate during a subject's prime years, followed by a rapidly increasing failure rate as the subject reaches "old age". Would this apply in a genetic association study if some characteristic was associated with old age? — Wayne, Dec 14 '11 at 17:51
Thanks for the responses! An example would be, candidate gene population-based association studies with bone mineral density, a quantitative trait that is a risk factor for osteoporosis and yes, it is a characteristic associated with ageing. — Kevin, Dec 15 '11 at 14:51
Related question: [Why is the 'age squared' variable divided by 100 or 1000?](http://stats.stackexchange.com/questions/165027/why-is-the-age-squared-variable-divided-by-100-or-1000) — Silverfish, Aug 08 '15 at 13:36

score 10 · Answer 1 · answered Jan 13 '12 at 17:13

Taylor series approximations tell us that pretty much any smooth function can be approximated by a polynomial, so including terms like $x^2$ or $x^3$ (where $x$ is age in your example) lets us estimate the coefficients of a polynomial approximation to a known or unknown non-linear function of age. Testing these coefficients is also a simple way to check whether the relationship is reasonably linear or whether non-linear terms give a better fit.
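As a rough illustration of testing the squared term, here is a minimal sketch on simulated data. The trait, its coefficients, the noise level, and the helper `ols` are all hypothetical choices made for this example, not taken from any real association study. It fits the model with and without age-squared and computes a t-statistic for the squared coefficient.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (hypothetical): a trait that rises with age, peaks, then declines.
n = 500
age = rng.uniform(20, 80, size=n)
trait = 0.5 + 0.04 * age - 0.0004 * age**2 + rng.normal(0, 0.1, size=n)

def ols(X, y):
    """Ordinary least squares: returns coefficients, standard errors, and residual sum of squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    rss = float(resid @ resid)
    sigma2 = rss / (len(y) - X.shape[1])      # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)     # covariance matrix of the coefficients
    return beta, np.sqrt(np.diag(cov)), rss

ones = np.ones(n)
X_lin = np.column_stack([ones, age])            # intercept + age
X_quad = np.column_stack([ones, age, age**2])   # intercept + age + age^2

beta_lin, se_lin, rss_lin = ols(X_lin, trait)
beta_quad, se_quad, rss_quad = ols(X_quad, trait)

# t-statistic for the age^2 coefficient: a large |t| suggests curvature in age.
t_age2 = beta_quad[2] / se_quad[2]

print("age^2 coefficient:", beta_quad[2])
print("t-statistic for age^2:", t_age2)
print("RSS linear vs quadratic:", rss_lin, rss_quad)
```

In a real association analysis the genotype would enter as a further column of the design matrix, with age and age-squared acting as adjustment covariates.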

Depending on the ultimate goal of the analysis, the non-linear terms can be kept for prediction, or plots of the predictions can be used to suggest the actual functional relationship. Other tools, such as cubic splines, can be used instead of polynomial terms to accomplish similar goals, but adding a squared term is a quick and easy way to do this.

score 2 · Answer 2 · answered Nov 17 '13 at 13:02

Keeping it simple: adding the square of the variable allows you to model the effect of age more accurately when age has a non-linear relationship with the dependent variable. For instance, the effect of age could be positive up until, say, the age of 50, and then negative thereafter.

Including age squared alongside age allows you to model the effect at differing ages, rather than assuming the effect is linear for all ages.
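To make the "positive up to 50, negative afterwards" example concrete, here is a short worked sketch. The symbols $\beta_0, \beta_1, \beta_2$ and the numeric values are illustrative assumptions, not estimates from any study. With the model

$$y = \beta_0 + \beta_1\,\text{age} + \beta_2\,\text{age}^2 + \varepsilon,$$

the effect of one additional year depends on the current age:

$$\frac{\partial y}{\partial\,\text{age}} = \beta_1 + 2\beta_2\,\text{age}.$$

When $\beta_1 > 0$ and $\beta_2 < 0$, this effect is positive until the turning point

$$\text{age}^{*} = -\frac{\beta_1}{2\beta_2}$$

and negative after it. For example, $\beta_1 = 0.04$ and $\beta_2 = -0.0004$ give $\text{age}^{*} = 0.04 / 0.0008 = 50$.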

See my blog post for a simple step-by-step guide, including how to interpret the age and age-squared variables.

http://www.excel-with-data.co.uk/blog-1/how-to-regression-analysis-in-excel/

score 1 · Answer 3 · answered Dec 14 '11 at 16:11

The squared term may have been added as a transformation to help satisfy model assumptions. It may also have been added because age has some sort of quadratic relationship with the outcome.

StatsStudent

