I am trying to understand how to get the coefficient of a multiple linear regression.
The formula is:
$b = (X'X)^{-1}(X')Y$
I try to calculate $b$ without package and with the lm
package inside R.
Doing so, I got different results.
I want to know why. Did I made a mistake? Or does the lm
package calculate differently because of the intercept?
> y <- c(1,2,3,4,5)
> x1 <- c(1,2,3,4,5)
> x2 <- c(1,4,5,7,9)
> Y <- as.matrix(y)
> X <- as.matrix(cbind(x1,x2))
> beta = solve(t(X) %*% X) %*% (t(X) %*% Y) ; beta
[,1]
x1 1.000000e+00
x2 -1.421085e-14
> model <- lm(y~x1+x2) ; model$coefficients
(Intercept) x1 x2
1.191616e-15 1.000000e+00 1.192934e-15
Update
As Alex and the other told me, it was a question of roundoff error. Therefore, I decided to take another data from the book "Essential Statistics for business and economics" by Anderson and all. In this case, the coefficients are the same in both lm
function and in my own matrix.
> y <- c(9.3, 4.8, 8.9, 6.5, 4.2, 6.2, 7.4, 6, 7.6, 6.1)
> x0 <- c(1,1,1,1,1,1,1,1,1,1)
> x1 <- c(100,50,100,100,50,80,75,65,90,90)
> x2 <- c(4,3,4,2,2,2,3,4,3,2)
> Y <- as.matrix(y)
> X <- as.matrix(cbind(x0,x1,x2))
> beta = solve(t(X) %*% X) %*% (t(X) %*% Y);beta
[,1]
x0 -0.8687015
x1 0.0611346
x2 0.9234254
> model <- lm(y~+x1+x2) ; model$coefficients
(Intercept) x1 x2
-0.8687015 0.0611346 0.9234254