A general answer is tough (and +1 to @Shifer), but if you're looking for one particular example where integrating outperforms profiling, you might find the Neyman-Scott "paradox" interesting.
The problem: suppose we have $\{(x_1,y_1),\dots,(x_n,y_n)\}$ where each pair $(x_i, y_i)$ is independent and
$$
{x_i \choose y_i} \sim \mathcal N(\mu_i \mathbf 1, \sigma ^2 I_2).
$$
Thus we have $n$ pairs of Gaussian RVs where all $2n$ RVs are independent but each pair has a different mean. The goal now is to estimate $\sigma^2$.
I'm going to do this with matrices to avoid summations all over the place, so I'll write this as$\newcommand{\e}{\varepsilon}$
$$
z = A\mu + \e
$$
where $z = (x_1, y_1, \dots, x_n, y_n)^T$, $\e \sim \mathcal N(0, \sigma^2 I_{2n})$, $\mu = (\mu_1,\dots,\mu_n)^T$, and
$$
A = \left(\begin{array}{cccccc}
1 & 0 & 0 & \dots & 0 & 0\\
1 & 0 & 0 & \dots & 0& 0\\
0 & 1 & 0 & \dots & 0& 0\\
0 & 1 & 0 & \dots & 0& 0\\
& & & \vdots & & \\
0 & 0 & 0 & \dots & 0& 1 \\
0 & 0 & 0 & \dots & 0& 1
\end{array}\right)
$$
so $A$ picks out the correct mean parameter for each pair and then the error is a spherical Gaussian.
I'll use $\tau =1 / \sigma^2$ in some places to make the math (especially the derivatives) easier.
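If you want to play with this numerically, here's a minimal numpy sketch of the setup (the particular `n` and `sigma2` are just illustrative values of mine):

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma2 = 50, 2.0

# A stacks each of the n means twice: A = I_n kron (1, 1)^T, shape (2n, n)
A = np.kron(np.eye(n), np.ones((2, 1)))

mu = rng.normal(size=n)  # one mean per pair; any fixed values work here
z = A @ mu + rng.normal(scale=np.sqrt(sigma2), size=2 * n)  # z = A mu + eps
```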
Profiling
The likelihood is
$$
f(z | \mu, \sigma^2) = \left(\frac{\tau}{2\pi} \right)^n \exp\left(-\frac \tau 2 \|z- A\mu\|^2\right).
$$
We'll first find the MLE of $\mu$ and plug that in to get a profiled likelihood. This is just OLS linear regression so $\hat\mu = (A^TA)^{-1}A^Tz$ in this case, and therefore the profiled log likelihood is (up to some constant)
$$
\ell_p(z | \hat\mu, \tau) = -\frac \tau 2 \|z- A\hat \mu\|^2 + n\log \tau.
$$
We don't actually need this, but it's worth noting that $A^TA = 2I$ so actually $\hat\mu$ is just the mean of each pair, i.e. $\hat\mu_i = \frac{x_i + y_i}{2}$.
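Continuing the sketch above, both facts are easy to confirm numerically:

```python
# A^T A = 2 I_n, so OLS collapses to pairwise means
assert np.allclose(A.T @ A, 2 * np.eye(n))

mu_hat = np.linalg.solve(A.T @ A, A.T @ z)  # (A^T A)^{-1} A^T z
pair_means = z.reshape(n, 2).mean(axis=1)   # (x_i + y_i) / 2
assert np.allclose(mu_hat, pair_means)
```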
This leads to
$$
\frac{\partial \ell_p}{\partial \tau} = -\frac 1 2 \|z- A\hat \mu\|^2 + n\tau^{-1}.
$$
Setting this to zero, solving for $\tau$, and inverting to get the MLE of $\sigma^2$, we get
$$
\hat\sigma^2 = \frac{\|z- A\hat \mu\|^2}{2n}
$$
(you can take the second derivative to show this is actually a max, and that is one of the main simplifications in using $\tau$ instead of $\sigma^2$).
Let $H = A(A^TA)^{-1}A^T$ and note that
$$
\|z- A\hat \mu\|^2 = \|(I-H)z\|^2 = z^T(I-H)z
$$
so since $z \sim \mathcal N(A\mu, \sigma^2 I)$ we've got a Gaussian quadratic form. This means
$$
E(z^T(I-H)z) = \sigma^2 \text{tr}(I-H) + \mu^TA^T(I-H)A\mu = n\sigma^2
$$
since $(I-H)A = 0$ and $\text{tr}(I-H) = 2n - \text{rank}(A) = n$ (see e.g. here for a proof of this result for quadratic forms).
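Those two facts, $(I-H)A = 0$ and $\text{tr}(I-H) = n$, can also be checked in the sketch from earlier:

```python
H = A @ np.linalg.solve(A.T @ A, A.T)  # hat matrix A (A^T A)^{-1} A^T
I2n = np.eye(2 * n)
assert np.isclose(np.trace(I2n - H), n)  # tr(I - H) = 2n - n = n
assert np.allclose((I2n - H) @ A, 0)     # H projects onto col(A)
```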
All together this means
$$
E(\hat\sigma^2) = \frac{n\sigma^2}{2n} = \frac{\sigma^2}2
$$
so $\hat\sigma^2$ is biased, and this bias does not go away as $n\to\infty$ (i.e. it is inconsistent). That's not good.
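A quick Monte Carlo (my own illustrative check, not part of the derivation) makes the factor-of-two bias visible:

```python
def sigma2_profiled(z, A):
    """Profiled MLE of sigma^2: ||z - A mu_hat||^2 / (2n)."""
    mu_hat = np.linalg.solve(A.T @ A, A.T @ z)
    return np.sum((z - A @ mu_hat) ** 2) / len(z)  # len(z) = 2n

ests = []
for _ in range(2000):
    mu = rng.normal(size=n)
    z = A @ mu + rng.normal(scale=np.sqrt(sigma2), size=2 * n)
    ests.append(sigma2_profiled(z, A))
print(np.mean(ests))  # hovers near sigma2 / 2 = 1.0, not sigma2 = 2.0
```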
Integrating
So we can try something else. I'm going to suppose $\mu \sim \mathcal N(0, (\tau\lambda)^{-1} I)$ (and $\mu \perp \e$) and then I'll integrate $\mu$ out and maximize the resulting integrated likelihood (so it'll be a marginal, or "type II," maximum likelihood estimate).
I don't actually need to evaluate the integral in this case since $\mu$ and $\e$ being independent Gaussians means the marginal distribution is also Gaussian. In particular,
$$
A\mu + \e \sim \mathcal N(0, (\tau\lambda)^{-1}(AA^T + \lambda I))
$$
so now I have
$$
f_I(z | \tau, \lambda) = \left(\frac{\tau\lambda}{2\pi}\right)^{n} |AA^T + \lambda I|^{-1/2} \exp\left(-\frac{\tau\lambda}2 z^T(AA^T+\lambda I)^{-1}z\right).
$$
I'm going to obtain my estimate by maximizing this w.r.t. $\tau$ so I'll take logs to get
$$
\ell_I(z | \tau, \lambda) = n\log \tau - \frac{\tau\lambda}2 z^T(AA^T+\lambda I)^{-1}z
$$
up to some constants. This leads to
$$
\frac{\partial \ell_I}{\partial \tau} = \frac{n}{\tau} - \frac{\lambda}{2} z^T(AA^T+\lambda I)^{-1}z
$$
so setting this to zero and inverting gives
$$
\tilde \sigma^2 = \frac{\lambda}{2n}z^T(AA^T+\lambda I)^{-1}z.
$$
This again is a Gaussian quadratic form, although now, marginally over $\mu$, $z \sim \mathcal N(0, (\sigma^2/\lambda)(AA^T+\lambda I))$, which means
$$
E(z^T(AA^T+\lambda I)^{-1} z) = \frac{\sigma^2}\lambda \text{tr}\left[(AA^T+\lambda I)^{-1}(AA^T+\lambda I)\right] \\
= \frac{2n \sigma^2}\lambda
$$
so
$$
E(\tilde \sigma^2) = \frac{\lambda}{2n} \cdot \frac{2n \sigma^2}\lambda = \sigma^2
$$
so not only is this unbiased, it is unbiased for any valid prior variance (i.e. any $\lambda > 0$).
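The same kind of Monte Carlo check (again a sketch of mine; the $\lambda$ values are arbitrary, and $\mu$ is drawn from its matching prior $\mathcal N(0, (\sigma^2/\lambda) I)$, which is the distribution the unbiasedness calculation averages over) lands near $\sigma^2$ for each $\lambda$:

```python
def sigma2_integrated(z, A, lam):
    """Integrated-likelihood estimate: (lam / 2n) z^T (A A^T + lam I)^{-1} z."""
    m = A.shape[0]  # m = 2n
    return lam * z @ np.linalg.solve(A @ A.T + lam * np.eye(m), z) / m

for lam in (0.1, 1.0, 10.0):
    ests = []
    for _ in range(2000):
        mu = rng.normal(scale=np.sqrt(sigma2 / lam), size=n)  # matching prior
        z = A @ mu + rng.normal(scale=np.sqrt(sigma2), size=2 * n)
        ests.append(sigma2_integrated(z, A, lam))
    print(lam, np.mean(ests))  # each hovers near sigma2 = 2.0
```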
This example can definitely feel a little contrived, but it does align with at least the intuitive idea that when there are tons of parameters, integrating with respect to a sensible prior (and I think a Gaussian prior for normal means is often sensible) can lead to better results than profiling (I call this intuitive because I think of averages as being more stable than maxima). But I was fortunate here that everything was analytically tractable for the integration; in general you won't be so lucky. In summary, this is a big topic with lots of complexities, but hopefully this was at least interesting.