I am using a linear regression, yet the only output I need is the Sum of Squared Residuals (SSR), I don't care about the coefficients. (Context is a non-linear LS, which is linear given an extra parameter, so I am running a grid search of least-squares SSR)
Is there a fast algorithm to extract the SSR only, with the QR (or other) decomposition? In matrix form:
$$SSR = Y'PY = Y'(I - X(X'X)^{-1}X')Y$$ and with the QR decomposition, $X = QR$, one has $X(X'X)^{-1}X'=QQ'$, which suggests one does not need to do a matrix inversion at all, though on the other side, one needs to grow a $n \times n$ matrix... SO:
Is there a fast algorithm to obtain the SSR only, or am I better sticking with usual function, that compute beta, and also the residuals
Is there such an implementation in R, (or linpack/fortran, called by R?). I saw
qr.resid()
, but it seems to be slower than thelm.fit()
, that returns everything? See benchmark below (changes when X is larger though).
Thanks!
data(EuStockMarkets)
X <- EuStockMarkets[1:100, 1:3]
Y <- EuStockMarkets[1:100, 4]
library(microbenchmark)
microbenchmark(lm.fit = c(crossprod(lm.fit(X, Y)$residuals)),
.lm.fit = c(crossprod(.lm.fit(X, Y)$residuals)),
qr_res = c(crossprod(qr.resid(qr(X), Y))),
times = 10)
#> Unit: microseconds
#> expr min lq mean median uq max neval
#> lm.fit 44.566 45.704 97.3347 47.0335 51.143 543.285 10
#> .lm.fit 11.144 11.995 15.0783 13.1285 13.856 35.594 10
#> qr_res 49.495 51.841 79.9043 53.4505 55.014 317.989 10
Created on 2018-10-20 by the reprex package (v0.2.1)