I have datasets that can form several different curvy patterns between the dependent and independent variables. The 'true' relationship likely depends on a large number of factors that aren't easily measured and so it remains unknown. One method of describing this relationship has been through LOESS.

I am trying to optimize the LOESS fit by tuning the span (also called the f-value) with k-fold cross-validation, using the loess.wrapper function from the bisoreg package in R. I mostly understand how the optimal span is chosen by minimizing the cross-validated estimate of predictive error. I've been studying the bias-variance tradeoff in sources such as Elements of Statistical Learning, and I'm still left with a difficult question: is it possible to calculate (or estimate) the bias and variance of a LOESS fit for different span values? I'd like to compare different LOESS fits on the same data using these statistics, but I think the difficulty is that I don't know the 'true' relationship in the data.

Any guidance or recommended readings are appreciated.

zpsimpso

1 Answer

I'm not an expert in this area at all, but I'm also interested in the question, so I did some digging.

First, it definitely is possible to calculate the variance of a LOESS fit, because such variance estimates are available from some packages, e.g. R's loess (see this post for a usage example).
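As a rough sketch of what that looks like in practice, the snippet below fits loess at a few spans and averages the squared pointwise standard errors returned by predict(..., se = TRUE). The simulated sine data, the noise level, the three span values and the names dat, grid and var_by_span are all illustrative assumptions of mine, not anything from the question.

```r
# Simulated data: the sine curve and noise sd are illustrative assumptions.
set.seed(1)
x <- seq(0, 10, length.out = 200)
y <- sin(x) + rnorm(length(x), sd = 0.3)
dat <- data.frame(x = x, y = y)

# Candidate spans and an evaluation grid inside the range of x (both arbitrary).
spans <- c(0.2, 0.5, 0.9)
grid <- data.frame(x = seq(0.5, 9.5, length.out = 50))

# For each span, fit loess and average the estimated pointwise variance
# (se.fit^2) of the fitted curve over the grid.
var_by_span <- sapply(spans, function(s) {
  fit <- loess(y ~ x, data = dat, span = s)
  pred <- predict(fit, newdata = grid, se = TRUE)
  mean(pred$se.fit^2)
})
names(var_by_span) <- paste0("span=", spans)
print(var_by_span)
```

Note that se.fit is scaled by the residual standard error of each fit, so a heavily oversmoothed fit can inflate it. These numbers describe the variance side only and say nothing about bias.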

The reference cited in the documentation for that loess function states [1]:

We can specify properties of the variances of the $\epsilon_i$ in one of two ways. The first is simply that they are a constant, $\sigma^2$. The second is that $a_i\epsilon_i$ has constant variance $\sigma^2$, where the a priori weights, $a_i$, are positive and known.

I take this to mean that you must either assume the error variance is globally constant or know your weights a priori. However, the chapter doesn't actually describe how the error variance is calculated in either situation, and I haven't yet found another reference that does.
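For what it's worth, the constant-variance case can be sketched with the usual linear-smoother argument. This is not taken from the chapter. It assumes the model $y_j = f(x_j) + \epsilon_j$ with uncorrelated, mean-zero errors of variance $\sigma^2$, and a fit without robustness iterations, so that the fitted values are linear in the responses, $\hat{y} = Ly$. The smoother matrix $L$ (which depends on the $x_j$ and the span) and the unknown curve $f$ are notation I am introducing here.

$$
\operatorname{Var}(\hat{y}_i) = \sigma^2 \sum_j L_{ij}^2,
\qquad
\operatorname{E}[\hat{y}_i] - f(x_i) = \sum_j L_{ij}\, f(x_j) - f(x_i).
$$

The variance term needs only $L$ and an estimate of $\sigma^2$, which is presumably why packages can report it. The bias term involves the unknown $f$, which matches the difficulty raised in the question.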

References

  1. W. S. Cleveland, E. Grosse and W. M. Shyu (1992). Local regression models. Chapter 8 of Statistical Models in S, eds. J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.
naught101