I was wondering if someone could flesh out the probabilistic interpretation of using the Radial Basis Function (RBF) kernel to compute a probability relating an observation to some reference value.
My question is partially motivated by the top answer in this Reddit thread:
> The RBF kernel is a standard kernel function in $R^n$ space because it has just one free parameter, $\gamma$, and satisfies the condition $K(x,x') = K(x',x)$. More specifically, one way to think of the RBF kernel is that if we assume $x'$ is characteristic of some Gaussian distribution (it is the mean value of that distribution), then $RBF(x,x')$ is the probability that $x$ is another sample from that distribution. In this interpretation, $\gamma$ is related to the tunable variance of that distribution.
Does this mean that if we have an observation $\mathbf{s}$ and we want to know whether $\mathbf{s}$ was generated by a source $\mathbf{q}$ (i.e., whether $\mathbf{s}$ is a noisy version of $\mathbf{q}$), then we can say:
$$P(\mathbf{s} \text{ generated by } \mathbf{q}) \propto \exp(-\gamma\, d(\mathbf{s},\mathbf{q}))$$ $$P(\mathbf{s} \text{ belongs to a Gaussian region defined by } \mathbf{q}) \approx \exp(-\gamma\, d(\mathbf{s},\mathbf{q}))$$
where $d(\mathbf{s},\mathbf{q}) = \lVert \mathbf{s}-\mathbf{q}\rVert^2$ is the squared Euclidean distance between $\mathbf{s}$ and $\mathbf{q}$ (the usual RBF form), and $\gamma$ is as described in the quote above.
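For what it's worth, here is a small numerical sketch of the interpretation I have in mind (all names here are my own, not from any particular source): if we set $\gamma = 1/(2\sigma^2)$, then $\exp(-\gamma\lVert\mathbf{s}-\mathbf{q}\rVert^2)$ is exactly an isotropic Gaussian likelihood $N(\mathbf{s};\, \mathbf{q},\, \sigma^2 I)$ with the normalizing constant dropped, so the ratio density/kernel is the same constant for every $\mathbf{s}$:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 0.7
gamma = 1.0 / (2.0 * sigma**2)   # the assumed correspondence gamma = 1/(2*sigma^2)
n = 3                            # dimension of s and q
q = rng.normal(size=n)           # the "source" / mean vector

def rbf(s, q, gamma):
    # RBF kernel with squared Euclidean distance
    return np.exp(-gamma * np.sum((s - q) ** 2))

def gauss_pdf(s, q, sigma):
    # isotropic multivariate Gaussian density N(s; q, sigma^2 I)
    norm = (2 * np.pi * sigma**2) ** (len(s) / 2)
    return np.exp(-np.sum((s - q) ** 2) / (2 * sigma**2)) / norm

# The ratio pdf/kernel should be the same constant for every observation s,
# i.e. the kernel is the Gaussian likelihood up to normalization.
ratios = []
for _ in range(5):
    s = rng.normal(size=n)
    ratios.append(gauss_pdf(s, q, sigma) / rbf(s, q, gamma))

assert np.allclose(ratios, ratios[0])
```

So the kernel value is proportional to the Gaussian density of $\mathbf{s}$ under mean $\mathbf{q}$, but (being an unnormalized density) it is not itself a probability. Is that the right way to read the quote?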
Does this all seem consistent? That is, does this probability follow directly from the RBF kernel comparing an observation to some mean value (or reference/source value)?
Any references/links to tutorials are most welcome.