Some time ago, while reading *Advanced Data Analysis from an Elementary Point of View* by Cosma Rohilla Shalizi, I came across this statement in a section about significant coefficients:
> Moreover, at a fixed sample size, the coefficients with smaller standard errors will tend to be the ones whose variables have more variance, and whose variables are less correlated with the other predictors. High input variance and low correlation help us estimate the coefficient precisely, but, again, they have nothing to do with whether the input variable actually influences the response a lot.
In a simple regression setting this is clear from the formula for the standard error of the estimator $\hat{\beta}$:
$$s_{\hat{\beta}}=\sqrt{\frac{\frac{1}{n-2}\sum_{i=1}^{n}\hat{\varepsilon}_{i}^{\,2}}{\sum_{i=1}^{n}(x_{i}-\bar{x})^{2}}}.$$
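As a quick numerical check (a sketch in Python with simulated data, not an example from the book), stretching $x$ while keeping the error variance fixed makes the estimated standard error of $\hat{\beta}$ shrink:

```python
# Simple regression: more spread in x (larger sum of (x_i - xbar)^2)
# means a smaller standard error for beta-hat, everything else equal.
import numpy as np

rng = np.random.default_rng(0)
n = 200
beta = 0.5

for scale in (1.0, 2.0, 4.0):            # increasing spread of x
    x = scale * rng.normal(size=n)
    y = beta * x + rng.normal(size=n)     # same error variance each time
    xc = x - x.mean()
    b_hat = np.sum(xc * (y - y.mean())) / np.sum(xc**2)
    resid = (y - y.mean()) - b_hat * xc   # residuals of the fit with intercept
    s2 = np.sum(resid**2) / (n - 2)
    se = np.sqrt(s2 / np.sum(xc**2))
    print(f"sd(x) ~ {scale}: se(beta_hat) = {se:.4f}")
```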
Moving to multivariate OLS, however, I cannot see why the same is true from the corresponding formula for the estimated standard error of a generic $\hat{\beta}_j$:
$$\widehat{\operatorname{s.e.}}(\hat{\beta}_{j})=\sqrt{s^{2}\left[(X^{T}X)^{-1}\right]_{jj}}.$$
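To make sure I am reading that formula correctly, this is how I compute it (my own sketch with simulated data, checked against statsmodels):

```python
# Compute se(beta_j) = sqrt(s^2 [(X'X)^{-1}]_{jj}) by hand and compare
# with the standard errors reported by statsmodels.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 500
X = sm.add_constant(rng.normal(size=(n, 2)))           # intercept + 2 predictors
y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat
s2 = resid @ resid / (n - X.shape[1])                   # s^2 with n - p df
se_manual = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))

print(se_manual)
print(sm.OLS(y, X).fit().bse)                           # should agree
```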
Running some tests with matrix inversion I found that it is generally true, and perhaps it can be seen from Cramer's rule, $\displaystyle A^{-1}=\frac{1}{\det(A)}\operatorname{adj}(A)$.
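For example, this is the kind of test I mean (a sketch with made-up data): the diagonal element $[(X^{T}X)^{-1}]_{jj}$ grows as the predictors become more correlated and shrinks as the variance of $x_j$ grows, so for a fixed $s^2$ the same happens to $\widehat{\operatorname{s.e.}}(\hat{\beta}_j)$:

```python
# With two centered predictors, watch [(X'X)^{-1}]_{11} as their
# correlation rho and the scale of x_1 change.
import numpy as np

rng = np.random.default_rng(2)
n = 1000

def diag_11(rho, scale):
    # draw (x1, x2) with correlation rho, then stretch x1 by `scale`
    cov = np.array([[1.0, rho], [rho, 1.0]])
    X = rng.multivariate_normal(np.zeros(2), cov, size=n)
    X[:, 0] *= scale
    X -= X.mean(axis=0)                   # center the columns
    return np.linalg.inv(X.T @ X)[0, 0]   # element for x_1

for rho in (0.0, 0.5, 0.9):               # more correlation -> larger element
    print(f"rho = {rho}: {diag_11(rho, 1.0):.5f}")
for scale in (1.0, 2.0, 4.0):             # more variance -> smaller element
    print(f"scale = {scale}: {diag_11(0.5, scale):.5f}")
```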
Can someone provide me with some insight into this?