I am working with the first fifteen waves of the British Household Panel Survey data. I would like to understand the intuition behind using age squared/1000 as one of the variables, as is done in published papers. How do I interpret this?
Thank you!
Age squared often takes fairly large values (e.g., $60^2=3600$), with correspondingly small estimated coefficients: an increase in age squared by one unit corresponds to a very small change in age, so the effect of that change as measured by the coefficient should be very small.
These small coefficients can be awkward to display in output and hard to interpret (although coefficients in regressions with polynomials are always hard to interpret). Dividing age squared by 1000 simply scales the estimated coefficient up by a factor of 1000; the fit itself does not change.
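To make the rescaling concrete, here is a minimal sketch in R. It uses simulated ages rather than the BHPS data, and the seed, sample size, and coefficient values are arbitrary assumptions chosen only for illustration.

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
n   <- 200
age <- runif(n, 20, 65)           # simulated ages, NOT BHPS data
y   <- 1 + 0.08 * age - 0.0009 * age^2 + rnorm(n, sd = 0.5)

fit_raw    <- lm(y ~ age + I(age^2))          # raw age squared
fit_scaled <- lm(y ~ age + I(age^2 / 1000))   # age squared / 1000

b_raw    <- unname(coef(fit_raw)[3])      # coefficient on age^2
b_scaled <- unname(coef(fit_scaled)[3])   # coefficient on age^2 / 1000

# The scaled coefficient is 1000 times the raw one ...
all.equal(b_scaled, 1000 * b_raw)
# ... while the fitted values are the same in both regressions.
all.equal(fitted(fit_raw), fitted(fit_scaled))
```

Both comparisons should come out as `TRUE` up to numerical tolerance, which is the sense in which the division is purely cosmetic.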
In general, we have the following result:
Consider transforming the regressor matrix $X$ by some invertible $k\times k$ matrix $A$, giving $XA$ (e.g., converting months of schooling to years and meters to centimeters when explaining wages).
Then, see what happens to the estimated coefficients:
\begin{align*}
\hat{\beta}^\circ&=\bigl(\underbrace{A'X'}_{``X'"}\underbrace{XA}_{``X"}\bigr)^{-1}\underbrace{A'X'}_{``X'"}y\\
&=A^{-1}(X'X)^{-1}(A')^{-1}A'X'y\\
&=A^{-1}(X'X)^{-1}X'y\\
&=A^{-1}\hat{\beta}
\end{align*}
That is, if
$$ A=\begin{pmatrix} 1/12&0\\ 0&100 \end{pmatrix}\qquad\text{so that}\qquad A^{-1}=\begin{pmatrix} 12&0\\ 0&1/100 \end{pmatrix} $$
in the above example, the effect of a change in the regressors is, sensibly, adjusted accordingly. Another year of education yields 12 times the additional wage of another month, and another cm has only $1/100$ of the effect on wages of another meter.
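The identity $\hat{\beta}^\circ=A^{-1}\hat{\beta}$ can also be checked numerically. The following is a sketch under stated assumptions: the data are simulated, the column names `months_school` and `height_m` and the helper `ols()` are made up for this illustration, and there is no intercept, so that the regression matches the $k=2$ matrix algebra exactly.

```r
set.seed(2)                       # arbitrary seed, only for reproducibility
n <- 50
X <- cbind(months_school = rnorm(n, 150, 24),   # hypothetical regressors
           height_m      = rnorm(n, 1.7, 0.1))
y <- rnorm(n)                     # simulated outcome, unrelated to X

A <- diag(c(1/12, 100))           # months -> years, meters -> centimeters

# OLS via the normal equations: solves (X'X) b = X'y
ols <- function(X, y) solve(crossprod(X), crossprod(X, y))

beta_hat  <- ols(X, y)            # coefficients for the original units
beta_circ <- ols(X %*% A, y)      # coefficients for the rescaled units

# The rescaled coefficients equal A^{-1} times the original ones
all.equal(as.vector(beta_circ), as.vector(solve(A) %*% beta_hat))
```

The final comparison should return `TRUE` up to numerical tolerance: the coefficient on schooling is multiplied by 12 and the coefficient on height is divided by 100.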
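Here is a little numerical example in R to illustrate the point (which, as I hope is now clear, is not intrinsically related to squared regressors). The data are random draws without a fixed seed, so the exact numbers will differ from run to run; what matters is that multiplying the regressor by 10 divides the slope by exactly 10: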
X <- rnorm(20)
y <- rnorm(20)
summary(lm(y~X))$coefficients[2]
[1] -0.03610936
X <- X*10
summary(lm(y~X))$coefficients[2]
[1] -0.003610936
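Coming back to the age squared/1000 regressor from the question, the same scaling carries over to the usual marginal-effect calculation. As a sketch, assume a specification of the following form (a hypothetical one, not taken from any particular paper), with error term $u$:
$$ y=\beta_0+\beta_1\,\text{age}+\beta_2\,\frac{\text{age}^2}{1000}+u $$
Then the effect of one more year of age depends on age itself:
\begin{align*}
\frac{\partial E(y\mid\text{age})}{\partial\,\text{age}}&=\beta_1+\frac{2\beta_2}{1000}\,\text{age}\\
&=\beta_1+\frac{\beta_2}{500}\,\text{age}
\end{align*}
so that, provided $\beta_2\neq 0$, the profile is flat at $\text{age}^*=-500\,\beta_1/\beta_2$. Had age squared been used without the division, its coefficient would be $\beta_2/1000$ and the marginal effect and the turning point would be exactly the same; only the number printed in the output changes.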