Following up on the discussion in "How are the standard errors of coefficients calculated in a regression?":
I have a question. In theory, the variance of the estimated regression coefficients $\hat\theta$ is calculated as $\sigma^2 (X^T X)^{-1}$, where $\sigma^2$ is the variance of the residuals, $SSE / (n-1)$, and $n$ is the number of samples in the training set. In R, however, it is computed as $MSE = SSE / (n-p-1)$, where $p$ is the number of features, multiplied by the same matrix $(X^T X)^{-1}$. Why does R divide by $n-p-1$ rather than $n-1$? Many thanks in advance.
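To make the comparison concrete, here is a minimal sketch of the two calculations side by side. It uses Python with NumPy rather than R, and the simulated data, the seed, and all variable names (`sigma2_n1`, `sigma2_mse`, and so on) are my own assumptions for illustration, not taken from the linked discussion. The design matrix includes an intercept column, so it has $p + 1$ columns.

```python
import numpy as np

# Simulated data (hypothetical; chosen only to illustrate the two divisors)
rng = np.random.default_rng(0)
n, p = 50, 3                                   # n samples, p features
features = rng.normal(size=(n, p))
X = np.column_stack([np.ones(n), features])    # add intercept column -> p + 1 columns
theta_true = np.array([1.0, 2.0, -1.0, 0.5])
y = X @ theta_true + rng.normal(scale=1.5, size=n)

# Ordinary least squares fit and residual sum of squares
theta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ theta_hat
sse = float(residuals @ residuals)

xtx_inv = np.linalg.inv(X.T @ X)

# Version from the question: sigma^2 estimated as SSE / (n - 1)
sigma2_n1 = sse / (n - 1)
se_n1 = np.sqrt(np.diag(sigma2_n1 * xtx_inv))

# Version the question attributes to R: MSE = SSE / (n - p - 1)
sigma2_mse = sse / (n - p - 1)
se_mse = np.sqrt(np.diag(sigma2_mse * xtx_inv))

print("SE with SSE/(n-1):    ", se_n1)
print("SE with SSE/(n-p-1):  ", se_mse)
print("ratio of the two:     ", se_mse / se_n1)
```

Because both versions multiply the same matrix $(X^T X)^{-1}$, the two sets of standard errors differ only by the constant factor $\sqrt{(n-1)/(n-p-1)}$, so the difference grows as $p$ approaches $n$.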