
Is there any rule of thumb or optimization technique for choosing the number of grid points in kernel regression?

I am running Nadaraya-Watson regression on 10 years of daily swap-rate data (2,500 observations). While cross-validating for the optimal bandwidth, I find that fewer grid points lead to a higher optimal bandwidth: e.g. 50 points gives h = 10, while 1000 points gives h = 2. I checked grid sizes from list(range(50, 1001, 100)), where 50 is the starting point, 1001 the end point, and 100 the step; 50 means the grid consists of 50 points.
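A minimal NumPy sketch of the setup (synthetic data stands in for the actual swap rates; all names here are illustrative, not the real code):

```python
import numpy as np

# Synthetic stand-ins for the swap-rate series described above.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(73456, 75800, 2500))               # ordinal dates
y = np.sin(x / 200.0) + 0.1 * rng.standard_normal(x.size)  # "rates"

def nadaraya_watson(grid, x, y, h):
    """Gaussian-kernel Nadaraya-Watson estimate at each grid point."""
    u = (grid[:, None] - x[None, :]) / h   # shape (n_grid, n_data)
    w = np.exp(-0.5 * u**2)                # Gaussian kernel weights
    return (w @ y) / w.sum(axis=1)

# The grid only sets where the curve is *evaluated*: for a fixed h,
# a 50-point and a 1000-point grid trace the same underlying estimate.
grid_coarse = np.linspace(x.min(), x.max(), 50)
grid_fine = np.linspace(x.min(), x.max(), 1000)
fit_coarse = nadaraya_watson(grid_coarse, x, y, h=20.0)
fit_fine = nadaraya_watson(grid_fine, x, y, h=20.0)
```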

Alex
  • What is the grid over exactly? Potential bandwidths? – einar Oct 16 '19 at 09:42
  • Hi. As a grid I am using ordinal dates. I take the max date and min date and convert them to ordinals (so something like min = 73456 and max = 75800, as an example), and then I create an equidistant grid of 50 or 100 ... 1000 points. For the bandwidth I used 20 (roughly corresponding to 20 days), but my calculation for the optimal bandwidth suggests h = 3 or 4. Though the errors for bandwidths up to 10-11 are only 50% higher than the minimum CV error. – Alex Oct 16 '19 at 11:17
  • And what do you use this ordinal date grid for? I ask these things because it is not entirely clear to me what your question is about. I think perhaps a code example might help. – einar Oct 16 '19 at 12:24
  • Sorry, the code is too large to post. I have a pandas DataFrame where the index is dates and the column is rates. My grid is the dates converted to integers; it could be just [50, 100 ... 2450] or [73050, 73100 ... 7540]. In the kernel function I calculate u = grid.T[:, None] - data.T[None, :]. The resulting matrix has shape n by m (rows are grid points, columns are data points). I used h = 20 (20 observations), though leave-one-out CV gives me h = 3 or 4 for a granular grid and h = 10-12 for a 50-point grid. The problem is that a more granular grid means a lower optimal bandwidth. Is there any rule to define grid granularity? – Alex Oct 16 '19 at 13:21
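  • To make the computation in the comments concrete, here is a sketch of leave-one-out CV for a Gaussian-kernel Nadaraya-Watson fit (illustrative data and names, not the actual code). Note that the leave-one-out errors are computed at the data points themselves, so no evaluation grid enters this step:

```python
import numpy as np

def loo_cv_error(x, y, h):
    """Leave-one-out CV error of a Gaussian-kernel Nadaraya-Watson fit,
    evaluated at the observations themselves (no evaluation grid involved)."""
    u = (x[:, None] - x[None, :]) / h
    w = np.exp(-0.5 * u**2)
    np.fill_diagonal(w, 0.0)              # drop each point from its own fit
    y_hat = (w @ y) / w.sum(axis=1)
    return float(np.mean((y - y_hat) ** 2))

# Illustrative data; the real series would be the swap rates.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 500.0, 500))
y = np.sin(x / 50.0) + 0.1 * rng.standard_normal(x.size)

bandwidths = np.arange(1, 31)
errors = [loo_cv_error(x, y, h) for h in bandwidths]
h_opt = int(bandwidths[int(np.argmin(errors))])
```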

0 Answers