Background: For consistency when $\eta_0$ is known, we typically need a function $S(\theta, \eta)$ such that for every $\delta > 0$ we have
$$\sup_{\theta \in \Theta} \frac{| S_n(\theta,\eta_0) - S(\theta,\eta_0) |}{1 + | S_n(\theta,\eta_0)| + |S(\theta,\eta_0) |} \xrightarrow{p}\ 0$$
$$\inf_{|\theta - \theta_0| > \delta}| S(\theta,\eta_0) | > 0 = |S(\theta_0, \eta_0)|$$
where the estimator $\tilde{\theta}$ satisfies $S_n(\tilde{\theta},\eta_0) = o_p(1)$.
Note that a more restrictive version of the first assumption is
$$\sup_{\theta \in \Theta} | S_n(\theta,\eta_0) - S(\theta,\eta_0) | \xrightarrow{p}\ 0$$
From the infimum condition, for any $\delta > 0$ there is an $\epsilon > 0$ such that
$$ P\left( \left| \tilde{\theta} - \theta_0 \right| > \delta \right) \le
P\left( \left| S(\tilde{\theta},\eta_0) \right| \ge \epsilon \right) $$
Consistency can then be proved through
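$$ \begin{align}| S(\tilde{\theta},\eta_0) | &\le | S_n(\tilde{\theta}, \eta_0) | + |S(\tilde{\theta},\eta_0) - S_n(\tilde{\theta}, \eta_0) | \\
&\le o_p(1) + o_p(1+|S_n(\tilde{\theta},\eta_0)| + |S(\tilde{\theta}, \eta_0)|) \\
&= o_p(1 + |S(\tilde{\theta}, \eta_0)|) \end{align}
$$
where the last line uses $S_n(\tilde{\theta},\eta_0) = o_p(1)$ once more. Writing this bound as $|S(\tilde{\theta},\eta_0)| \le R_n (1 + |S(\tilde{\theta},\eta_0)|)$ with $0 \le R_n = o_p(1)$ and rearranging on the event $R_n < 1$ gives $|S(\tilde{\theta},\eta_0)| \le R_n/(1-R_n) = o_p(1)$.
Hence $P\left( | S(\tilde{\theta},\eta_0) | \ge \epsilon \right) \to 0$, which proves consistency.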
Solution:
Suppose that in addition to the previous assumptions, either
(1) $S_n(\theta,\eta)$ is stochastically continuous uniformly in $\theta$ with respect to $\eta$ at $\eta_0$
or
(2) $S(\theta,\eta)$ is continuous uniformly in $\theta$ with respect to $\eta$ at $\eta_0$
with $S_n(\hat{\theta},\hat{\eta}) = o_p(1)$ and $\hat{\eta} \xrightarrow{p} \eta_0$.
If (1) is true the proof is trivial, with
$$ \begin{align} |S_n(\hat{\theta},\eta_0)| &\le |S_n(\hat{\theta},\hat{\eta})| + |S_n(\hat{\theta},\hat{\eta}) - S_n(\hat{\theta},\eta_0)| \\
&\le |S_n(\hat{\theta},\hat{\eta})| + \sup_{\theta \in \Theta}|S_n(\theta,\hat{\eta}) - S_n(\theta,\eta_0)| \\
&= o_p(1) + o_p(1)
\end{align}$$
where the first term is $o_p(1)$ by the definition of $\hat{\theta}$ and the second is $o_p(1)$ because of (1).
We conclude that $\hat{\theta}$ also satisfies $S_n(\hat{\theta},\eta_0) = o_p(1)$, so the argument in the background applies directly.
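As a toy illustration of case (1), here is a minimal simulation sketch. Everything in it is an assumption made for the demo and not part of the argument above: the model $Y_i = \theta_0 + \eta_0 + \text{noise}$, the auxiliary sample $Z_i = \eta_0 + \text{noise}$ used to estimate the nuisance parameter, the estimating function $S_n(\theta,\eta) = \bar{Y} - \theta - \eta$, and the names `simulate`, `s_n`, `plug_in_estimate`, and `run`. In this model $S_n(\theta,\hat{\eta}) - S_n(\theta,\eta_0) = \eta_0 - \hat{\eta}$ for every $\theta$, so the supremum in the display above is exactly $|\hat{\eta} - \eta_0|$.

```python
import random
import statistics

THETA_0 = 2.0   # true parameter of interest (assumed value for the demo)
ETA_0 = -1.0    # true nuisance parameter (assumed value for the demo)


def simulate(n, rng):
    """Draw Y_i = theta_0 + eta_0 + noise and an auxiliary sample Z_i = eta_0 + noise."""
    y = [THETA_0 + ETA_0 + rng.gauss(0.0, 1.0) for _ in range(n)]
    z = [ETA_0 + rng.gauss(0.0, 1.0) for _ in range(n)]
    return y, z


def s_n(theta, eta, y):
    """Sample estimating function S_n(theta, eta) = mean(Y) - theta - eta."""
    return statistics.fmean(y) - theta - eta


def plug_in_estimate(y, z):
    """Return (theta_hat, eta_hat), where theta_hat solves S_n(theta, eta_hat) = 0."""
    eta_hat = statistics.fmean(z)
    theta_hat = statistics.fmean(y) - eta_hat
    return theta_hat, eta_hat


def run(sizes=(100, 10_000, 100_000), seed=0):
    """For each sample size, return (n, |theta_hat - theta_0|, sup gap, S_n(theta_hat, eta_0))."""
    rng = random.Random(seed)
    results = []
    for n in sizes:
        y, z = simulate(n, rng)
        theta_hat, eta_hat = plug_in_estimate(y, z)
        # S_n(theta, eta_hat) - S_n(theta, eta_0) = eta_0 - eta_hat for every theta,
        # so the supremum over theta equals |eta_hat - eta_0|.
        sup_gap = abs(eta_hat - ETA_0)
        results.append((n, abs(theta_hat - THETA_0), sup_gap, s_n(theta_hat, ETA_0, y)))
    return results


if __name__ == "__main__":
    for n, err, sup_gap, s_at_eta0 in run():
        print(f"n={n:>7}  |theta_hat - theta_0|={err:.4f}  "
              f"sup gap={sup_gap:.4f}  S_n(theta_hat, eta_0)={s_at_eta0:+.4f}")
```

In this sketch $|S_n(\hat{\theta},\eta_0)|$ coincides with the sup gap, so both shrink together as $n$ grows, which is the mechanism the inequality chain relies on.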
If (2) is true, the infimum condition together with (2) gives, for any $\delta > 0$, constants $\epsilon_1 > 0$ and $\epsilon_2 > 0$ such that
$$\inf_{\theta :|\theta-\theta_0| > \delta}\inf_{|\eta -\eta_0| \le \epsilon_2 }| S(\theta,\eta) | > \epsilon_1 $$
Therefore, we have
$$ P\left( \left| \hat{\theta} - \theta_0 \right| > \delta \right) \le
P\left( \left| S(\hat{\theta},\hat{\eta}) - S(\theta_0,\hat{\eta}) \right| > \epsilon_1 \right) + P(|\hat{\eta} - \eta_0| > \epsilon_2)$$
The last term goes to zero as $n \to \infty$ by consistency of $\hat{\eta}$.
Then, we have
$$ \begin{align}
| S(\hat{\theta},\hat{\eta}) - S(\theta_0,\hat{\eta}) | &\le
|S(\hat{\theta},\eta_0) - S(\theta_0,\eta_0)| \\
&+
|S(\hat{\theta},\hat{\eta}) - S(\hat{\theta},\eta_0)| +
|S( \theta_0,\hat{\eta}) - S(\theta_0,\eta_0)|
\\
&\le o_p(1) + 2\sup_{\theta \in \Theta}|S( \theta,\hat{\eta}) - S(\theta,\eta_0)| \\ &= o_p(1) \end{align}
$$
where the last line is true because of (2).