19

note: with no correct answers after a month, I have reposted to SO

Background

I have a model, $f$, where $Y=f(\textbf{X})$

$\textbf{X}$ is an $n \times m$ matrix of samples from $m$ parameters and $Y$ is the $n \times 1$ vector of model outputs.

$f$ is computationally intensive, so I would like to approximate $f$ using a multivariate cubic spline through $(X,Y)$ points, so that I can evaluate $Y$ at a larger number of points.

Question

Is there an R function that will calculate an arbitrary relationship between X and Y?

Specifically, I am looking for a multivariate version of the splinefun function, which generates a spline function for the univariate case.

e.g. this is how splinefun works for the univariate case

x <- 1:10
y <- runif(10)
foo <- splinefun(x,y)
foo(1:10) #returns y, as example
all(y == foo(1:10))
## TRUE

What I have tried

I have reviewed the mda package, and it seems that the following should work:

library(mda)
x   <- data.frame(a = 1:10, b = 1:10/2, c = 1:10*2)
y   <- runif(10)
foo <- mars(x,y)
predict(foo, x) #all the same value
all(y == predict(foo,x))
## FALSE

but I could not find any way to implement a cubic-spline in mars

update since offering the bounty, I changed the title - If there is no R function, I would accept, in order of preference: an R function that outputs a gaussian process function, or another multivariate interpolating function that passes through the design points, preferably in R, else Matlab.

David LeBauer
  • 7,060
  • 6
  • 44
  • 89

3 Answers3

12

This paper presented at UseR! 2009 seems to address a similar problem

http://www.r-project.org/conferences/useR-2009/slides/Roustant+Ginsbourger+Deville.pdf

It suggests the DiceKriging package http://cran.r-project.org/web/packages/DiceKriging/index.html

In particular, check the functions km and predict.

Here is a an example of three dimensional interpolation. It looks to be straightforward to generalise.

x <- c(0, 0.4, 0.6, 0.8, 1)
y <- c(0, 0.2, 0.3, 0.4, 0.5)
z <- c(0, 0.3, 0.4, 0.6, 0.8)

model <- function(param){
2*param[1] + 3*param[2] +4*param[3]
}


model.in <- expand.grid(x,y,z)
names(model.in) <- c('x','y','z')

model.out <- apply(model.in, 1, model)

# fit a kriging model 
m.1 <- km(design=model.in, response=model.out, covtype="matern5_2")

# estimate a response 
interp <- predict(m.1, newdata=data.frame(x=0.5, y=0.5, z=0.5), type="UK",    se.compute=FALSE)
# check against model output
interp$mean
# [1]  4.498902
model(c(0.5,0.5,0.5))
# [1] 4.5

# check we get back what we put in
interp <- predict(m.1, newdata=model.in, type="UK", se.compute=FALSE)
all.equal(model.out, interp$mean)
# TRUE
Tony Ladson
  • 668
  • 7
  • 14
6

You need more data for a spline fit. mgcv indeed is a good choice. For your specific request you need to set the cubic spline as the basis function bs='cr' and also not have it penalized with fx=TRUE. Both options are set for a smooth term that is set with s(). Predict works as expected.

library(mgcv)
x <- data.frame(a = 1:100, b = 1:100/2, c = 1:100*2)
y <- runif(100)
foo <- gam(y~a+b+s(c,bs="cr",fx=TRUE),data=x)
plot(foo)
predict(foo,x)
Alex
  • 1,047
  • 7
  • 12
  • Thank you for your help, but if this were a cubic spline, shouldn't I expect `predict(foo,x)` to return `y`? – David LeBauer Jul 28 '11 at 21:57
  • Sorry, didn't notice that you want a perfect approximation. Then apparently mgcv is not of much help: stop("Basis only handles 1D smooths") (from https://svn.r-project.org/R-packages/trunk/mgcv/R/smooth.r) – Alex Jul 29 '11 at 00:38
0

You give no details as to the form of the function $f(X)$; it might be that a piecewise constant function is a sufficiently good approximation, in which case you might want to fit a regression tree (with package rpart for instance). Otherwise, you might want to look at package earth, in addition to what has been suggested already.

F. Tusell
  • 7,733
  • 19
  • 34
  • 2
    the form of $f(X)$ is beyond the scope of this problem, but it is a [dynamic global vegetation model](http://en.wikipedia.org/wiki/Dynamic_global_vegetation_model) described in the appendix of [Medvigy et al 2009](http://www.oeb.harvard.edu/faculty/moorcroft/publications/publications/Medvigy_etal_2009.pdf) – David LeBauer Aug 23 '11 at 04:18