In Section 3.6.1 of "Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences", J. Cohen writes that the standard error of a regression coefficient is:
$SE_{B_i} = \frac{\sigma_Y}{\sigma_i}\sqrt{\frac{1-R_Y^2}{(1-R_i^2)\cdot DOF}}$
$B_i$ is the $i$th regression coefficient, $Y$ is the dependent variable, $R_Y$ is the multiple correlation coefficient, and DOF is the number of degrees of freedom. I assume that $R_i$ is the multiple correlation coefficient of the $i$th IV with all other IVs (i.e. with two IVs, simply the correlation coefficient between them). Unfortunately the book isn't very clear about the definition of $R_i$, or the definition is buried in the text of one of the preceding chapters.
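One way to probe this reading of $R_i$ is to compare the formula against the usual matrix expression $\sqrt{\hat\sigma^2\,[(X^TX)^{-1}]_{ii}}$ on synthetic data. The following is a minimal sketch, not Cohen's own procedure. It assumes a model that includes an intercept, DOF $= n - k - 1$, and $R_i^2$ obtained by regressing the $i$th IV on the remaining IVs. The helper names `r_squared`, `cohen_se` and `matrix_se` and the simulated data are made up for illustration.

```python
import numpy as np

def r_squared(y, X):
    """Centered R^2 from regressing y on the columns of X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def cohen_se(y, X):
    """Standard errors from the sd-ratio formula (assumes an intercept)."""
    n, k = X.shape
    dof = n - k - 1  # assumption: one extra degree of freedom for the intercept
    r2_y = r_squared(y, X)
    se = np.empty(k)
    for i in range(k):
        others = np.delete(X, i, axis=1)
        r2_i = r_squared(X[:, i], others)  # i-th IV regressed on the other IVs
        sd_ratio = y.std(ddof=1) / X[:, i].std(ddof=1)
        se[i] = sd_ratio * np.sqrt((1 - r2_y) / ((1 - r2_i) * dof))
    return se

def matrix_se(y, X):
    """Standard errors from sigma^2 * (A^T A)^{-1}, intercept included."""
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    sigma2 = (resid @ resid) / (n - k - 1)
    cov = sigma2 * np.linalg.inv(A.T @ A)
    return np.sqrt(np.diag(cov))[1:]  # drop the intercept's standard error

# Synthetic data with two correlated IVs (purely illustrative)
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)
X = np.column_stack([x1, x2])
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(size=n)

se_formula = cohen_se(y, X)
se_matrix = matrix_se(y, X)
print(se_formula, se_matrix)
```

Under these assumptions the two computations should agree up to floating-point error, and with two IVs `r_squared(x1, x2.reshape(-1, 1))` reduces to the squared correlation between them. The sketch covers only a model with an intercept.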
I want to check this with the following GLM without intercept:
$Y_{MEDV} = B_{NOX}\cdot X_{NOX} + B_{RAD}\cdot X_{RAD}$
Here $MEDV$, $NOX$ and $RAD$ are variables in the Boston House Prices Dataset from sklearn in Python: the median value of homes (MEDV), the nitric oxide concentration (NOX), and the accessibility of radial highways (RAD).
This is the code in Python [source]:
import numpy as np
import pandas as pd
import statsmodels.api as sm
from sklearn import datasets
# Note: load_boston was removed in scikit-learn 1.2, so this needs an older version
data = datasets.load_boston()
# Define the data/predictors using the pre-set feature names
df = pd.DataFrame(data.data, columns=data.feature_names)
# Put the target (housing value -- MEDV) in another DataFrame
target = pd.DataFrame(data.target, columns=["MEDV"])
X = df[['NOX', 'RAD']]
y = target['MEDV']
model = sm.OLS(y, X).fit()  # no constant column is added, so there is no intercept
print(model.summary())  # GLM result
r_noxrad = np.corrcoef(df['NOX'], df['RAD'])[0, 1]  # correlation between the two IVs
Rsquared = 0.803  # This comes from the GLM result above.
dof = y.size - 2  # No intercept
# Compute standard errors (to compare with GLM result)
std_err_nox = target['MEDV'].std() / df['NOX'].std() * np.sqrt((1 - Rsquared) / ((1 - r_noxrad**2) * dof))
std_err_rad = target['MEDV'].std() / df['RAD'].std() * np.sqrt((1 - Rsquared) / ((1 - r_noxrad**2) * dof))
The statsmodels package computes $R = 0.876$, and the standard errors on the coefficients $SE_{B_{NOX}}=1.440$ and $SE_{B_{RAD}}=0.063$. However, my code above works out to $\text{std_err_nox} = 1.983$ and $\text{std_err_rad} = 0.026$.
What am I getting wrong?