What effect does data averaging have on the variogram? To be specific, consider this simple example:

    # Simulate a pure random walk data set; call this PROCESS A
    n <- 1000             # number of data points
    t <- 1:n              # time
    y <- cumsum(rnorm(n)) # data points
    # Average each pair of consecutive points; call this PROCESS B
    t2 <- rowMeans(matrix(t, nrow = n / 2, byrow = TRUE))
    y2 <- rowMeans(matrix(y, nrow = n / 2, byrow = TRUE))
    # Compute and plot the empirical variograms
    library(geoR)
    # Coordinates are (time, 1), so all points lie on a line
    var1 <- as.geodata(data.frame(coords = t(rbind(t, rep(1, n))), data.col = y))
    vario <- variog(var1)
    var2 <- as.geodata(data.frame(coords = t(rbind(t2, rep(1, n / 2))), data.col = y2))
    vario2 <- variog(var2)
    # PROCESS A in black, PROCESS B in red
    plot(vario$u, vario$v, type = 'b', pch = 16, cex = .7)
    points(vario2$u, vario2$v, col = 2, type = 'b', pch = 16, cex = .7)

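The same comparison can be cross-checked without geoR. Below is a minimal base-R sketch that applies the classical estimator (half the mean squared difference at a given lag) to both series. Note that `emp_semivar` is a hypothetical helper written for this illustration, not a geoR function, and the seed and the lag range are arbitrary choices.

```r
# Hypothetical helper (not part of geoR): classical empirical semivariance
# of a regularly spaced series x at integer index lags.
emp_semivar <- function(x, lags) {
  sapply(lags, function(k) {
    d <- x[(k + 1):length(x)] - x[1:(length(x) - k)]  # differences at lag k
    0.5 * mean(d^2)
  })
}

set.seed(1)                 # arbitrary seed, for reproducibility only
n  <- 1000
y  <- cumsum(rnorm(n))      # PROCESS A: random walk
y2 <- rowMeans(matrix(y, nrow = n / 2, byrow = TRUE))  # PROCESS B: pair averages

lags_b  <- 1:20             # index lags for B (B is spaced 2 time units apart)
h       <- 2 * lags_b       # the same lags expressed in time units
gamma_a <- emp_semivar(y, h)
gamma_b <- emp_semivar(y2, lags_b)

# Side-by-side comparison at matching time lags
print(data.frame(h = h, A = gamma_a, B = gamma_b, diff = gamma_a - gamma_b))
```

The index lags of PROCESS B are doubled before being applied to PROCESS A, so that both series are compared at the same time separation $h$.
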
Theoretically, $\gamma(h) = \frac{1}{2} \operatorname{Var}(y(t) - y(t+h))$, and for a linear variogram of a simple process without a nugget effect this is $\frac{1}{2}\sigma^2 |h|$, where $\sigma^2$ is the variance of the increments of the underlying process. Thus, we expect that (1) the two variograms are parallel and (2) the ratio of the nugget effect of PROCESS A to that of PROCESS B is 4/3. Comparing the plots with the math, (1) is confirmed but (2) is not. Why? Is this a simple example of the 'change of support' problem in geostatistics? Thanks in advance.