Almost all variogram estimators use Euclidean distances; that is, they apply the Pythagorean formula to the coordinates.
Issues
For study areas covering most of the earth, this is a challenging problem. One issue is that distortions in the metric arising from using projected coordinates will differentially affect distances at different locations. This results in a kind of horizontal "smearing" of the experimental variogram as distances (the x-coordinate in the variogram) are shifted higher or lower than they truly are. It is important to keep the range of relative metric distortions reasonably small, perhaps no greater than 10-20% across the whole area.
For places of limited north-south extent and well away from the poles, the variation in metric distortion from using latitudes and longitudes (as if they were Euclidean coordinates) is relatively small and harmless. However, the relative lengths of one degree of longitude (which are scaled by the cosine of the latitude) are large enough to introduce noticeable north-south, east-west anisotropy in the structure (except at points close to the Equator). This is a geometric anisotropy that would be applied to any other correlation structures in the data and therefore would usually introduce unwanted complications, unless the phenomenon under study is expected to have an anisotropy with this orientation (which might be the case for certain long-range weather patterns, for instance).
Solutions
There are two relatively simple solutions. One, which is in common use, is to use a suitable projected coordinate system. For study areas up to the size of a small country, a national coordinate system will have so little distortion that the Euclidean calculations are more than accurate enough. For instance, in Canada south of the Arctic Circle, use a Lambert Conformal Conic projection.
A work-around to cope with the inevitable distortions on a world-scale study is to tile the study area into overlapping areas, each of which can be projected with suitable accuracy. Perform the kriging or other estimation separately in each area, then mosaic the results. They will differ a little on the overlap sections, but any differences should be relatively minor. If not, you can use the kriging estimation variances to form inverse-variance weighted averages of the predicted values wherever they overlap. (This use of inverse variances for weighting is just a heuristic, because the overlapping predicted values typically are strongly correlated, being based on the same sets of neighbors.)
For larger study areas, extending over more than a hemisphere, you can convert the (longitude, latitude) coordinates $(\lambda,\phi)$ into 3D geocentric Cartesian coordinates $(x,y,z)$ = $R(\cos{\lambda}\cos{\phi},\sin{\lambda}\cos{\phi},\sin{\phi})$ and compute distances with them. This will not introduce any artificial anisotropy. In effect it changes the true variogram $\gamma(h)$, where $h$ is geodesic distance on a sphere of radius $R$, to $\gamma(2R\arcsin\frac{h}{2R})$. This is just a mild re-expression of the lag distance. Because $\frac{d}{d\theta}\sin{\theta}|_0=1$, this does not change the slope at the origin--which is crucial for correct modeling of the random field--and because the change is solely a re-expression of the distance $h$, neither the sill nor the nugget will be altered.
Complications and pitfalls
It is not difficult to do variographic modeling in (lon, lat) coordinates if you really want to, but remember that (a) if the variogram is to be used for estimating integrals over extensive areas (as in reserve estimation), the usual formulas will not apply and (b) if it will be used for interpolation and mapping--which can involve calculations of dozens of distances at millions of points--specialized kriging software would be needed and it's going to take a performance hit when computing all those spherical distances.
Also bear in mind that when variograms are used for such estimation and interpolation, as in kriging, usually only short distances are involved. Thus, the long-range distance distortion from using the $(x,y,z)$ solution is of little practical concern. However, somehow you have to persuade the kriging program to use all three coordinates. If your program will handle 3D coordinates and let you specify the prediction points (instead of assuming they lie on a rectangular grid), then you can generate a mesh of points on the sphere in $(x,y,z)$ coordinates, pass them as input to the kriging program, and then map the kriging output in any way you see fit--including with (lon, lat) coordinates.