Introduction and Summary
Tobler's Law of Geography asserts:
Everything is related to everything else, but near things are more related than distant things.
Kriging adopts a model of those relationships in which:
"Things" are numerical values at locations on the earth's surface (or in space), usually represented as a Euclidean plane.
These numerical values are assumed to be realizations of random variables.
"Related" is expressed in terms of the means and covariances of these random variables.
(A collection of random variables associated with points in space is called a "stochastic process.") The variogram provides the information needed to compute those covariances.
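To make the link between a variogram and covariances concrete, here is a minimal sketch. It assumes second-order stationarity, so that the covariance at separation h is C(h) = C(0) - gamma(h), and it uses an exponential model as one common choice. The function names and parameter values are illustrative, not taken from the question.

```python
import math

def exp_variogram(h, psill=1.0, rng=1.0, nugget=0.0):
    """Semivariance gamma(h) of an exponential model; gamma(0) is 0 by definition."""
    if h == 0:
        return 0.0
    return nugget + psill * (1.0 - math.exp(-h / rng))

def covariance(h, psill=1.0, rng=1.0, nugget=0.0):
    """Covariance implied under second-order stationarity: C(h) = C(0) - gamma(h)."""
    sill = nugget + psill  # C(0), the variance of each random variable
    return sill - exp_variogram(h, psill, rng, nugget)

# Print separation, semivariance, and implied covariance side by side.
for h in (0.0, 0.5, 1.0, 5.0):
    print(h, round(exp_variogram(h), 4), round(covariance(h), 4))
```

As the separation grows the semivariance rises toward the sill and the covariance falls toward zero, which is Tobler's Law in numerical form.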
What Kriging Is
Kriging specifically is the prediction of things at places where they have not been observed. To make the prediction process mathematically tractable, Kriging limits the possible formulas to be linear functions of the observed values. That makes the problem a finite one of determining what the coefficients should be. These can be found by requiring that the prediction procedure have certain properties. Intuitively, an excellent property is that the differences between the predictor and the true (but unknown) value should tend to be small: that is, the predictor should be precise. Another property which is highly touted but is more questionable is that on average the predictor should equal the true value: it should be accurate.
(The reason that insisting on perfect accuracy is questionable--but not necessarily bad--is that it usually makes any statistical procedure less precise: that is, more variable. When shooting at a target, would you prefer to scatter the hits evenly around the rim, rarely hitting the center, or would you accept results that are focused just next to, but not exactly on, the center? The former is accurate but imprecise while the latter is inaccurate but precise.)
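A small worked example may help with the trade-off. It is not Kriging: it only estimates a mean from n independent observations, and the numbers (true mean 1, variance 4, n = 4) are made up for illustration. The estimator is c times the sample mean, so c = 1 is unbiased and c = 0.5 is deliberately biased.

```python
def mse(c, mu, sigma2, n):
    """Mean squared error of c * (sample mean) as an estimator of mu."""
    bias = (c - 1.0) * mu          # zero only when c == 1
    variance = c * c * sigma2 / n  # shrinking c reduces the variance
    return bias * bias + variance

mu, sigma2, n = 1.0, 4.0, 4        # hypothetical values
unbiased = mse(1.0, mu, sigma2, n)
shrunk = mse(0.5, mu, sigma2, n)
print(unbiased, shrunk)
```

With these numbers the unbiased estimator has the larger mean squared error: giving up a little accuracy buys a larger gain in precision. Whether that trade is worthwhile depends on the unknown true mean, which is part of why unbiasedness is often insisted upon anyway.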
These assumptions and criteria--that means and covariances are appropriate ways to quantify relatedness, that a linear prediction will work, and that the predictor should be as precise as possible subject to being perfectly accurate--lead to a system of equations that has a unique solution provided the covariances have been specified in a consistent manner. The resulting predictor is therefore called a "BLUP": Best Linear Unbiased Predictor.
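For readers who want to see the system written out, here is one standard form. It assumes the simplest setting, in which the mean is constant but unknown (usually called ordinary Kriging). The notation is mine: s_1, ..., s_n are the observed locations, s_0 is the prediction location, C is the covariance function, the lambda_i are the coefficients, and m is a Lagrange multiplier enforcing unbiasedness.

```latex
\begin{aligned}
\hat{Z}(s_0) &= \sum_{i=1}^{n} \lambda_i Z(s_i), \qquad \sum_{i=1}^{n} \lambda_i = 1, \\
\sum_{j=1}^{n} \lambda_j\, C(s_i, s_j) + m &= C(s_i, s_0), \qquad i = 1, \ldots, n, \\
\operatorname{Var}\bigl(\hat{Z}(s_0) - Z(s_0)\bigr) &= C(s_0, s_0) - \sum_{i=1}^{n} \lambda_i\, C(s_i, s_0) - m .
\end{aligned}
```

The first line is the linear predictor with its unbiasedness constraint, the second is the set of n equations obtained by minimizing the error variance subject to that constraint, and the third is the minimized error variance (the Kriging variance).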
Where the Variogram Comes In
Finding these equations requires operationalizing the program just described. This is done by writing down the covariances between the predictor and the observations, with both thought of as random variables. The algebra of covariances causes the covariances among the observed values to enter into the Kriging equations, too.
At this point we reach a dead end, because those covariances are almost always unknown. After all, in most applications we have observed only one realization of each of the random variables: namely, our dataset, which constitutes just one number at each distinct location. Enter the variogram: this mathematical function tells us what the covariance between any two values ought to be. It is constrained to ensure that these covariances are "consistent" (in the sense that it will never give a set of covariances that are mathematically impossible: not all collections of numerical measures of "relatedness" will form actual covariance matrices). That is why a variogram is essential to Kriging.
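Here is a minimal sketch of how a model variogram feeds the Kriging equations. It is written in plain Python so the arithmetic is visible, and it is not how gstat or any other package implements it. The assumptions are mine: a constant unknown mean (ordinary Kriging), an exponential variogram with made-up parameters and no nugget, and four hypothetical data points.

```python
import math

def gamma(h, psill=1.0, rng=2.0):
    """Exponential model variogram (no nugget); an assumed model, not a fitted one."""
    return psill * (1.0 - math.exp(-h / rng))

def cov(p, q, psill=1.0, rng=2.0):
    """Covariance implied by the variogram: C(h) = sill - gamma(h)."""
    h = math.hypot(p[0] - q[0], p[1] - q[1])
    return psill - gamma(h, psill, rng)

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ordinary_kriging(points, values, target):
    """Return (prediction, kriging variance, weights) at `target`."""
    n = len(points)
    # Left-hand side: covariances among the observations, bordered by the
    # unbiasedness constraint (weights sum to 1) and its Lagrange multiplier.
    a = [[cov(points[i], points[j]) for j in range(n)] + [1.0] for i in range(n)]
    a.append([1.0] * n + [0.0])
    # Right-hand side: covariances between each observation and the target.
    b = [cov(p, target) for p in points] + [1.0]
    sol = solve(a, b)
    weights, mult = sol[:n], sol[n]
    prediction = sum(w * z for w, z in zip(weights, values))
    variance = cov(target, target) - sum(w * c for w, c in zip(weights, b[:n])) - mult
    return prediction, variance, weights

points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]  # hypothetical data
values = [1.0, 2.0, 3.0, 4.0]
pred, var, w = ordinary_kriging(points, values, (0.5, 0.5))
print(pred, var, w)
```

Every number entering the linear system comes from the variogram; the data values are used only at the end, when the weights are applied. In this symmetric layout the four weights should come out equal, so the prediction at the center is the plain average of the data, up to rounding.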
References
Because the immediate question has been answered, I will stop here. Interested readers can learn how variograms are estimated and interpreted by consulting good texts such as Journel & Huijbregts' Mining Geostatistics (1978) or Isaaks & Srivastava's Applied Geostatistics (1989). (Note that the estimation process introduces two objects called "variograms": an empirical variogram derived from data and a model variogram that is fitted to it. All references to "variogram" in this answer are to the model. The call to vgm in the question returns a computer representation of a model variogram.) For a more modern approach in which variogram estimation and Kriging are appropriately combined, see Diggle & Ribeiro Jr.'s Model-based Geostatistics (2007) (which is also an extended manual for the R packages GeoR and GeoRglm).
Comments
Incidentally, whether you are using Kriging for prediction or some other algorithm, the quantitative characterization of relatedness afforded by the variogram is useful for assessing any prediction procedure. Notice that all spatial interpolation methods are predictors from this point of view--and many of them are linear predictors, such as IDW (Inverse Distance Weighted). The variogram can be used to assess the average value and dispersion (standard deviation) of any of the interpolation methods. Thus it has applicability far beyond its use in Kriging.
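As a sketch of that last point, the following computes the error variance of an IDW predictor directly from a model variogram. It relies on a standard identity for any linear predictor whose weights sum to 1, which IDW weights do; with a constant mean such a predictor also has zero average error. The variogram parameters, data locations, and target are hypothetical.

```python
import math

def gamma(h, psill=1.0, rng=2.0):
    """Exponential model variogram (no nugget); parameters are illustrative."""
    return psill * (1.0 - math.exp(-h / rng))

def dist(p, q):
    """Euclidean distance between two points in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def idw_weights(points, target, power=2.0):
    """Inverse-distance weights normalized to sum to 1.
    The target must not coincide with a data point."""
    raw = [dist(p, target) ** -power for p in points]
    total = sum(raw)
    return [r / total for r in raw]

def error_variance(points, target, weights):
    """Var(sum_i w_i Z(s_i) - Z(s_0)) for weights summing to 1:
    2 * sum_i w_i gamma(s_i, s_0) - sum_i sum_j w_i w_j gamma(s_i, s_j)."""
    cross = sum(w * gamma(dist(p, target)) for w, p in zip(weights, points))
    within = sum(wi * wj * gamma(dist(pi, pj))
                 for wi, pi in zip(weights, points)
                 for wj, pj in zip(weights, points))
    return 2.0 * cross - within

points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, 3.0)]  # hypothetical data
target = (0.4, 0.3)
for power in (1.0, 2.0, 4.0):
    w = idw_weights(points, target, power)
    print(power, round(error_variance(points, target, w), 4))
```

Comparing the printed variances across powers shows how the variogram can rank competing interpolators. Under the same model, none of them can have a smaller error variance than the Kriging weights, since Kriging minimizes exactly this quantity.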