Suppose we want to model a spatial stochastic process with a Gaussian process. We have observations of $z$ (the dependent variable) at spatial points $(x_{i},y_{i})$. If the dataset is large, building and inverting the dense covariance matrix of the Gaussian process becomes computationally expensive, which is why approximation methods such as SPDE+FEA (Stochastic Partial Differential Equations + Finite Element Analysis) are suggested. Lindgren et al. (2011) explain how a Gaussian process with a Matérn covariance function can be approximated. They suggest that the solution to the SPDE below
$\kappa^{2} f-\Delta f=\epsilon / \tau$
is the Gaussian process we are looking for, and that the SPDE can be solved with a finite element method: build a mesh over the domain and approximate the solution with basis functions defined on that mesh. I have read the paper and many other resources but came away with only a vague idea of the approach. The big missing piece is how the data come into play: I don't see where the observations $z$ are used when the finite element method solves the SPDE. Can someone explain what is happening under the hood, with minimal use of sophisticated calculus and advanced linear algebra?
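
To make the gap concrete, here is a minimal sketch of the finite element step as I currently understand it. It rests on several simplifying assumptions of mine, not taken from the paper: a 1D mesh instead of a 2D triangulation, piecewise-linear "hat" basis functions, a lumped (diagonal) mass matrix, and no special boundary handling. The function name `spde_precision_1d` is hypothetical. As far as I can tell, for the SPDE above the weights of the basis functions get a sparse precision matrix $Q=\tau^{2} K C^{-1} K$ with $K=\kappa^{2} C+G$, and that matrix depends only on the mesh, $\kappa$ and $\tau$.

```python
import numpy as np


def spde_precision_1d(nodes, kappa, tau):
    """Precision matrix of the hat-function weights on a 1D mesh (sketch).

    Assumes piecewise-linear basis functions and a lumped mass matrix.
    """
    h = np.diff(nodes)          # element lengths
    n = len(nodes)
    idx = np.arange(n - 1)      # left node index of each element

    # Lumped mass matrix C: each element gives half its length to both end nodes.
    c = np.zeros(n)
    c[:-1] += h / 2
    c[1:] += h / 2

    # Stiffness matrix G: integrals of products of basis-function derivatives.
    G = np.zeros((n, n))
    G[idx, idx] += 1 / h
    G[idx + 1, idx + 1] += 1 / h
    G[idx, idx + 1] -= 1 / h
    G[idx + 1, idx] -= 1 / h

    # Discretised operator (kappa^2 - Laplacian) and the resulting precision.
    K = kappa**2 * np.diag(c) + G
    Q = tau**2 * K @ np.diag(1 / c) @ K
    return Q


# Hypothetical mesh and parameter values, chosen only for illustration.
nodes = np.linspace(0.0, 10.0, 51)
Q = spde_precision_1d(nodes, kappa=1.0, tau=1.0)
```

Nothing in this sketch touches $z$ or the observation locations $(x_{i},y_{i})$, which is exactly the part I cannot place.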