
I want to calculate the hat matrix directly in R for a logit model. According to Long (1997), the hat matrix for logit models is defined as:

$$H = VX(X'VX)^{-1} X'V$$

where $X$ is the matrix of independent variables and $V$ is a diagonal matrix with $\sqrt{\pi(1-\pi)}$ on the diagonal.

I use the optim function to maximize the likelihood and derive the Hessian. So I guess my question is: how do I calculate $V$ in R?

Note: My likelihood function looks like this:

loglik <- function(theta, x, y) {
  x <- as.matrix(x)
  beta <- theta[1:ncol(x)]
  # log-likelihood of the logit model
  loglik <- sum(-y * log(1 + exp(-(x %*% beta))) - (1 - y) * log(1 + exp(x %*% beta)))
  return(-loglik)  # optim minimises, so return the negative log-likelihood
}

And I feed this to the optim function as follows:

logit <- optim(c(1, 1), loglik, y = y, x = x, hessian = TRUE)

Where x is a matrix of independent variables, and y is a vector with the dependent variable.
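
The estimates and an estimate of their covariance matrix can then be pulled out of the optim result like this (a minimal sketch; since loglik returns the negative log-likelihood, the Hessian that optim reports is the observed information):

beta.hat <- logit$par             # ML estimates of the coefficients
vcov.hat <- solve(logit$hessian)  # inverse observed information = estimated covariance
se.hat <- sqrt(diag(vcov.hat))    # standard errors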

Note: I know that there are canned procedures for doing this, but I need to do it from scratch.

Thomas Jensen
  • In what way are you using **optim** (with what options, with or without supplying a gradient function, etc.)? Logistic regression is a smooth convex problem. It's readily solved using Newton's method or similar. In fact, to get an estimate of the covariance matrix, you *need* to do (something close to) this. – cardinal Mar 10 '11 at 14:41
  • I have added the info to the post – Thomas Jensen Mar 10 '11 at 14:51

1 Answer


For logistic regression, $\pi$ is calculated using the formula

$$\pi=\frac{1}{1+\exp(-X\beta)}$$

So the diagonal values of $V$ can be calculated in the following manner:

pi <- as.vector(1/(1 + exp(-X %*% beta)))  # fitted probabilities, as a plain vector
v <- sqrt(pi * (1 - pi))                   # diagonal of V
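
As a side note, base R's plogis evaluates the same logistic CDF, which keeps that line a little tidier (p is used here instead of pi so the built-in constant is not masked):

p <- as.vector(plogis(X %*% beta))  # identical to 1/(1 + exp(-X %*% beta))
v <- sqrt(p * (1 - p))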

Now, multiplying by a diagonal matrix from the left means that each row is multiplied by the corresponding element of the diagonal, which in R can be achieved with simple elementwise multiplication:

VX <- X * v  # each row of X scaled by the corresponding diagonal entry
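
If in doubt, this identity is easy to verify on a small made-up matrix (the numbers below are arbitrary):

A <- matrix(rnorm(12), nrow = 4)  # toy 4 x 3 matrix
w <- runif(4)                     # toy diagonal entries
all.equal(diag(w) %*% A, A * w)   # TRUE: row scaling without forming diag(w)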

Then $H$ can be calculated in the following way:

H <- VX %*% solve(crossprod(VX, VX), t(VX))  # VX (X'V^2X)^{-1} (VX)'

Note: since $V$ contains standard deviations, I suspect that the correct formula for $H$ is

$$H=VX(X'V^2X)^{-1}X'V$$

The example code works for this formula.
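
To see that it does, here is a self-contained dummy example with simulated data; the canned glm fit is used only to supply the estimates and the reference leverages (the optim route from the question gives the same $\hat{\beta}$):

set.seed(42)
n <- 200
X <- cbind(1, rnorm(n))                      # design matrix with intercept
y <- rbinom(n, 1, plogis(X %*% c(-0.5, 1)))  # simulated logit outcomes

fit <- glm(y ~ X - 1, family = binomial)     # reference fit
beta <- coef(fit)

pi <- as.vector(1/(1 + exp(-X %*% beta)))    # fitted probabilities
v <- sqrt(pi * (1 - pi))
VX <- X * v
H <- VX %*% solve(crossprod(VX, VX), t(VX))

# leverages agree with the canned ones up to glm's convergence tolerance
all.equal(diag(H), unname(hatvalues(fit)), tolerance = 1e-6)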

mpiktas
  • Thanks mpiktas, but I am somewhat stuck at how to calculate V. Is V simply the diagonal of the covariance matrix? – Thomas Jensen Mar 10 '11 at 14:38
  • @Thomas, no, it's the diagonal matrix as you've specified it in your initial post, but where the $\pi_i$ are replaced by the estimates $\hat{\pi}_i$, i.e., the estimated probability that the $i$th response is 1 under the model. – cardinal Mar 10 '11 at 14:57
  • Ok, so for each row in the data I simply calculate the predicted probability, and multiply the square root of this vector with the matrix of independent variables? – Thomas Jensen Mar 10 '11 at 15:03
  • @Thomas, yes, that is how it is done in my code. You can check with a dummy example that it really works. – mpiktas Mar 10 '11 at 15:05
  • @mpiktas - you are right about $V^2$. Effectively what you are doing is "standardising" the $X$ matrix and the $Y$ vector, then doing weighted least squares on the standardised variables, then backtransforming to the original scale. You need to iterate because the standardisation depends on $\beta$. – probabilityislogic Mar 10 '11 at 23:14