Actually, I thought a Gaussian Process is a kind of Bayesian method, since many tutorials present GP regression in a Bayesian context; for example, in this tutorial, see page 10.
Suppose the GP prior is $$\pmatrix{h\\ h^*} \sim N\left(0,\pmatrix{K(X,X)&K(X,X^*)\\ K(X^*,X)&K(X^*,X^*)}\right),$$ where $(h,X)$ corresponds to the observed training data and $(h^*,X^*)$ to the test data to be predicted. The actually observed noisy output is $$Y=h+\epsilon,$$ where $\epsilon$ is the noise, $$\epsilon\sim N(0,\sigma^2I),$$ and likewise $\epsilon^*$ is the independent noise on the test outputs. Now, as shown in the tutorial, we have $$\pmatrix{Y\\ Y^*}=\pmatrix{h\\ h^*}+\pmatrix{\epsilon\\ \epsilon^*}\sim N\left(0,\pmatrix{K(X,X)+\sigma^2I&K(X,X^*)\\ K(X^*,X)&K(X^*,X^*)+\sigma^2I}\right),$$ and finally, by conditioning on $Y$, we get $p(Y^*|Y)$, which is called the predictive distribution in some books or tutorials, and the posterior in others.
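To make the conditioning step concrete, here is a minimal numpy sketch of $p(Y^*|Y)$ on a 1-D toy problem (the squared-exponential kernel, the sine data, and names like `rbf` and `Xs` are my own choices, not from the tutorial):

```python
import numpy as np

def rbf(A, B, ell=1.0):
    """Squared-exponential kernel k(a,b) = exp(-(a-b)^2 / (2*ell^2))."""
    d = A[:, None] - B[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

rng = np.random.default_rng(0)
sigma2 = 0.1                              # noise variance sigma^2
X = np.linspace(0.0, 5.0, 8)              # training inputs
Y = np.sin(X) + np.sqrt(sigma2) * rng.standard_normal(X.shape)
Xs = np.linspace(0.0, 5.0, 50)            # test inputs X*

# Blocks of the joint covariance written above
K   = rbf(X, X) + sigma2 * np.eye(len(X))     # K(X,X) + sigma^2 I
Ks  = rbf(X, Xs)                              # K(X,X*)
Kss = rbf(Xs, Xs) + sigma2 * np.eye(len(Xs))  # K(X*,X*) + sigma^2 I

# Gaussian conditioning on Y:
#   p(Y*|Y) = N( Ks^T K^{-1} Y,  Kss - Ks^T K^{-1} Ks )
L = np.linalg.cholesky(K)                            # K = L L^T
alpha = np.linalg.solve(L.T, np.linalg.solve(L, Y))  # K^{-1} Y
mean = Ks.T @ alpha                                  # predictive mean
V = np.linalg.solve(L, Ks)                           # L^{-1} Ks
cov = Kss - V.T @ V                                  # predictive covariance
```

Note that this route really is nothing but the conditioning formula for a joint Gaussian, which is exactly what prompts the question below.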
QUESTION
According to many tutorials, the predictive distribution $p(Y^*|Y)$ is derived by conditioning on $Y$. If this is correct, I don't understand why GP regression is Bayesian: nothing Bayesian is used in deriving this conditional distribution, right?
However, I don't actually think the predictive distribution should be just the conditional distribution. I think it should be $$p(Y^*|Y)=\iint p(Y^*|h^*)\,p(h^*|h)\,p(h|Y)\,dh^*\,dh$$ (note that $h^*$ must be integrated out too, since it does not appear on the left-hand side). In the above formula, $p(h|Y)$ is the posterior, right?
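For what it's worth, when I carry out these Gaussian integrals myself (my own sketch, writing $K:=K(X,X)$ for brevity), Bayes' rule applied to $p(Y|h)\,p(h)$ gives the posterior $$p(h|Y)=N\left(K(K+\sigma^2I)^{-1}Y,\; K-K(K+\sigma^2I)^{-1}K\right),$$ and pushing it through $p(h^*|h)$ and then $p(Y^*|h^*)$ yields $$p(Y^*|Y)=N\left(K(X^*,X)(K+\sigma^2I)^{-1}Y,\; K(X^*,X^*)+\sigma^2I-K(X^*,X)(K+\sigma^2I)^{-1}K(X,X^*)\right),$$ which appears to be exactly the same distribution as the one obtained by directly conditioning the joint Gaussian above. So is the conditioning route just a shortcut for this Bayesian computation?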