There are some subtle differences between ordinary and simple kriging; maybe that is what confuses you. GP regression, in the way it is usually presented, is analogous to simple kriging. The Gaussian process Wikipedia entry says that the article refers explicitly to a "zero-meaned distribution"; that is the same assumption found in simple kriging, where the mean is treated as known (and typically set to zero). Ordinary kriging instead assumes a constant but unknown mean that is estimated from the data.
Also, generally speaking, kriging is usually performed in two- or three-dimensional space (e.g. pollutant concentration over some given area), while most GPR toy examples are one-dimensional (e.g. $CO_2$ concentration in the atmosphere against time).
Ultimately, kriging/GPR is an interpolation technique, and most (not all) of the differences among its variants lie in the assumption made about the mean trend $\mu(X)$ (or $E[X_t]$ if you like this notation better).
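To make the role of the mean concrete, here is a minimal NumPy sketch. It is not taken from any kriging or GP library: the squared-exponential kernel, its parameters, the function names and the data are all made up for illustration. Simple kriging uses a known mean $\mu$ in $\hat f(x_*) = \mu + k_*^T K^{-1}(y - \mu \mathbf{1})$, which for $\mu = 0$ is the usual zero-mean GP posterior mean; ordinary kriging, in one common formulation, plugs in the generalized least squares estimate $\hat\mu = \mathbf{1}^T K^{-1} y / \mathbf{1}^T K^{-1} \mathbf{1}$ instead.

```python
import numpy as np

def rbf(a, b, length=1.0, var=1.0):
    """Squared-exponential covariance between two 1-D point sets (illustrative choice)."""
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / length) ** 2)

def simple_kriging(x, y, x_new, mean=0.0, nugget=1e-8):
    """Known constant mean; with mean=0 this is the zero-mean GP posterior mean."""
    K = rbf(x, x) + nugget * np.eye(len(x))  # small nugget for numerical stability
    k_star = rbf(x_new, x)
    return mean + k_star @ np.linalg.solve(K, y - mean)

def ordinary_kriging(x, y, x_new, nugget=1e-8):
    """Unknown constant mean, estimated from the data by generalized least squares."""
    K = rbf(x, x) + nugget * np.eye(len(x))
    ones = np.ones(len(x))
    mu_hat = (ones @ np.linalg.solve(K, y)) / (ones @ np.linalg.solve(K, ones))
    return simple_kriging(x, y, x_new, mean=mu_hat, nugget=nugget), mu_hat

# Made-up observations whose level is far from zero.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([10.2, 10.9, 9.7, 10.4])
x_far = np.array([50.0])  # a location far away from every observation

sk = simple_kriging(x, y, x_far)            # reverts to the assumed mean (zero)
ok, mu_hat = ordinary_kriging(x, y, x_far)  # reverts to the estimated mean

print("simple kriging far from data:  ", sk[0])
print("ordinary kriging far from data:", ok[0])
print("estimated mean:                ", mu_hat)
```

Both predictors interpolate the observations in the same way near the data; they differ in what they fall back to away from it, which is exactly the assumption about the mean. In practice, people using zero-mean GPR often get a similar effect by centering `y` before fitting.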