I am attempting to generate a probability map from spatio-temporal observations where the response is 0 or 1. I am working in R.

I first attempted to use the `bam` function from the `mgcv` package. My formula looks like this: `gam.fit = bam(Y ~ s(X1) + s(X2) + s(X3) + te(LATITUDE,LONGITUDE,k=25), data=data.df, family=binomial)`, where X1, X2 and X3 are covariates that have some spatial correlation. I actually have about 15 covariates, some of them factors and one of them the day of the year. When I mapped the deviance residuals, there were still some clear spatial patterns, so the model does not seem to capture all of the spatial structure.

I thought of fitting a second model to `residuals(gam.fit)`, but I'm getting confused by all the logit transformations and I'm not sure how I could end up with a "combined residual map".

```r
data.df$res = residuals(gam.fit, type = 'deviance')
gam.fit_res = bam(res ~ te(LATITUDE, LONGITUDE, k = 25), data = data.df)
y1 = predict(gam.fit, data.df)
y2 = predict(gam.fit_res, data.df)
mu = binomial(link = 'logit')$linkinv(y1 + y2)
total_res = binomial(link = 'logit')$dev.resids(data.df$Y, mu, rep(1, length(mu)))
```

I end up with something that doesn't make much sense: the residuals are all positive and still show a clear spatial pattern.
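One likely reason for the all-positive values: as far as I can tell, `dev.resids` in a family object returns each observation's contribution to the deviance, that is, the squared deviance residual, rather than a signed residual. The sketch below uses made-up toy data (`y`, `x` and the small `glm` fit are hypothetical, not the real model) to show how a signed deviance residual can be rebuilt and checked against `residuals(..., type = "deviance")`.

```r
# Toy 0/1 responses and one covariate (made-up values, for illustration only)
y  <- c(0, 0, 1, 1, 0, 1)
x  <- 1:6
wt <- rep(1, length(y))

fam <- binomial(link = "logit")
fit <- glm(y ~ x, family = fam)
mu  <- unname(fitted(fit))

# dev.resids() gives the per-observation deviance contributions,
# i.e. squared deviance residuals, so every value is non-negative
d2 <- fam$dev.resids(y, mu, wt)

# Signed deviance residuals: square root, carrying the sign of (y - mu)
signed_res <- sign(y - mu) * sqrt(pmax(d2, 0))

# These should agree with what residuals() reports for the same fit
all.equal(signed_res, unname(residuals(fit, type = "deviance")))
```

Applied to the code above, this would mean taking `sign(data.df$Y - mu) * sqrt(total_res)` before mapping. That only addresses the sign; it does not say whether adding `y2` to `y1` is a sensible correction in the first place.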

Am I doing something wrong? Are there any other ways I could correct my spatial patterns?

kjetil b halvorsen
jgadoury
  • Interesting stuff. I don't think I can offer assistance with your problem, however, I am curious as to where the "temporal" component is in your code. I see lat and lon, but no time. – Jon Oct 20 '16 at 17:18
  • @Jon The day of the year is a covariate, I edited my question accordingly. – jgadoury Oct 20 '16 at 17:20
  • So your goal is to apply a residual map. What is the map supposed to convey? Logistic regression residuals aren't like OLS residuals. I know geospatial analysts will plot the OLS residuals to look for clusters/patterns, but I do not think the same approach will benefit you when using logistic regression. – Jon Oct 20 '16 at 17:42
  • http://web.pdx.edu/~newsomj/da2/ho_logistic.pdf – Jon Oct 20 '16 at 17:42
  • What is meant by "combined residual map"? Also, it should be noted that logistic regression will give you log(odds) or, with the appropriate transformation, the probabilities P(Y = 1 | X). If your data is dichotomous, 1 - P(Y = 1|X) as is given with `residuals.glm(..., type = "response")` doesn't provide you with anything useful, since Y is usually a factor, not a probability. I'm mostly familiar with using deviance residuals for assessing goodness of fit; I'm not sure they can be applied in the same manner as OLS residuals. – Jon Oct 20 '16 at 17:53
  • Here's a post that might be helpful: http://stats.stackexchange.com/questions/1432/what-do-the-residuals-in-a-logistic-regression-mean – Jon Oct 20 '16 at 17:53

0 Answers