12

I would like to find out the values (x, y) used in plotting plot(b, seWithMean=TRUE) in mgcv package. Does anyone know how I can extract or compute these values?

Here is an example:

library(mgcv) 
set.seed(0)
dat <- gamSim(1, n=400, dist="normal", scale=2) 
b   <- gam(y~s(x0), data=dat) 
plot(b, seWithMean=TRUE)
gung - Reinstate Monica
  • 132,789
  • 81
  • 357
  • 650
  • I'm not familiar with `gam` models, but have you examined the different attributes of that object? You can look at the names of the objects with `names(b)`. I'm guessing whatever details you are after will be retained within that object somewhere. – Chase Mar 02 '11 at 13:56

4 Answers4

21

Starting with mgcv 1.8-6, plot.gam invisibly returns the data it uses to generate the plots, i.e. doing

pd <- plot(<some gam() model>)

gives you a list with the plotting data in pd.


ANSWER BELOW FOR mgcv <= 1.8-5:

I've repeatedly cursed the fact that the plot functions for mgcv don't return the stuff they are plotting -- what follows is ugly but it works:

library(mgcv) 
set.seed(0)
dat <- gamSim(1, n = 400, dist = "normal", scale = 2)
b <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat)

plotData <- list()
trace(mgcv:::plot.gam, at = list(c(27, 1)), 
  ## tested for mgcv_1.8-4. other versions may need different at-argument.
  quote({
    message("ooh, so dirty -- assigning into globalenv()'s plotData...")
    plotData <<- pd
    }))
mgcv::plot.gam(b, seWithMean = TRUE, pages = 1)

par(mfrow = c(2, 2))
for (i in 1:4) {
  plot(plotData[[i]]$x, plotData[[i]]$fit, type = "l", xlim = plotData[[i]]$xlim,
    ylim = range(plotData[[i]]$fit + plotData[[i]]$se, plotData[[i]]$fit -
      plotData[[i]]$se))
  matlines(plotData[[i]]$x, cbind(plotData[[i]]$fit + plotData[[i]]$se, 
    plotData[[i]]$fit - plotData[[i]]$se), lty = 2, col = 1)
  rug(plotData[[i]]$raw)  
}
fabians
  • 2,616
  • 15
  • 19
  • Many thanks for your help. When I reproduce your code up to `plotData < –  Mar 02 '11 at 23:15
  • The "trace" trick used to work for me. However, it recently failed me. I suspect it has to do with a new version of mgcv package (I'm currently using v 1.8-3), which may require a different "at" argument in the trace function. Can anyone help me on how to get the correct vector for the "at" argument of the trace function? Many thanks in advance! –  Jan 16 '15 at 08:04
  • @Pepijn see my edit. – fabians Jan 16 '15 at 11:15
4

Package visreg can make effect plots similar to GAM (but perhaps not identical?) and does give the plot components as output as well, formatted as a list. Using plyr one can make a dataframe of the output. Example:

plot <- visreg(model, type = "contrast")
smooths <- ldply(plot, function(part)   
  data.frame(x=part$x$xx, smooth=part$y$fit, lower=part$y$lwr, upper=part$y$upr))
user13380
  • 161
  • 2
3

This will not be a complete answer. All the plotting for gam objects is being done with function plot.gam. You can look at its code by simply typing

> plot.gam

in R console. As you will see the code is huge. What I've gleaned from it, that all the plotting is done by gathering relevant information in pd object which is a list. So one of the possible solution would be to edit plot.gam, using edit for example, so that it returns that object. Adding pd before last } will be enough. I would advise adding invisible(pd), so that this object is returned only if you ask for it:

> pd <- plot(b,seWithMean = TRUE)

Then inspect this object and search in the code of plot.gam for the lines with plot and lines. Then you will see which of the relevant x and y values appear in the plot.

mpiktas
  • 33,140
  • 5
  • 82
  • 138
  • oops, I didn't see yours when I posted my answer. Well, it's a little more detailed anyway.... – fabians Mar 02 '11 at 15:16
  • @fabians, no worries, I would not have posted mine if I saw yours. I outlined general idea, you have provided the code. Since the question asks for code, your answer is better. – mpiktas Mar 02 '11 at 15:20
0
## And this is the code for multiple variables!
require(mgcv)
n      = 100
N      = n
tt     = 1:n
arfun  = c(rep(.7,round(n/3)),rep(.3,round(n/3)),rep(-.3,ceiling(n/3)))
arfun2 = c(rep(.8,round(n/3)),rep(.3,round(n/3)),rep(-.3,ceiling(n/3)))
int    = .1*(tt-mean(tt))/max(tt)-.1*((tt-mean(tt))/(max(tt)/10))^2
y      = rep(NA,n)
s.sample <- N
x        <- 10*rnorm(s.sample)
z        <- 10*rnorm(s.sample)
for(j in 1:n){
  y[j]=int[j]+x[j]*arfun[j]+z[j]*arfun2[j]+rnorm(1)  
}

mod = gam(y ~ s(tt) + s(tt, by=x) + s(tt, by=z)) 
## getting the data out of the plot
plotData <- list()
trace(mgcv:::plot.gam, at=list(c(25,3,3,3)),
      # this gets you to the location where plot.gam calls 
      #    plot.mgcv.smooth (see ?trace)
      # plot.mgcv.smooth is the function that does the actual plotting and
      # we simply assign its main argument into the global workspace
      # so we can work with it later.....

      quote({
        # browser()
        print(pd)
        plotData <<- c(plotData, pd)
      }))

# test: 
mgcv::plot.gam(mod, seWithMean=TRUE)


# see if it succeeded
slct = 3
plot(plotData[[slct]]$x, plotData[[slct]]$fit, type="l", xlim=plotData$xlim, 
     ylim=range(plotData[[slct]]$fit + plotData[[slct]]$se, plotData[[slct]]$fit - 
                plotData[[slct]]$se))
matlines(plotData[[slct]]$x, 
         cbind(plotData[[slct]]$fit + plotData[[slct]]$se, 
               plotData[[slct]]$fit - plotData[[slct]]$se), lty=2, col=1)
rug(plotData[[slct]]$raw)
gung - Reinstate Monica
  • 132,789
  • 81
  • 357
  • 650
Lauie
  • 1
  • 1