Randomised estimators are defined in a more general setting than this one and, in particular, need not be connected with sufficiency. In full generality, a randomised decision rule is a decision rule that returns a random decision for a given observation (or dataset). To reproduce the quote from Lehmann and Casella:
The starting point of a statistical analysis, as formulated in the
preceding sections, is a random observable $X$ taking on values in a
sample space $\mathcal X$, and a family of possible distributions of
$X$. It often turns out that some part of the data carries no
information about the unknown distribution and that $X$ can therefore
be replaced by some statistic $T = T(X)$ (not necessarily real-valued)
without loss of information. A statistic $T$ is said to be sufficient
for $X$, or for the family $\mathcal P = \{P_\theta,\
\theta\in\Omega\}$ of possible distributions of $X$, or for $\theta$,
if the conditional distribution of $X$ given $T = t$ is independent of
$\theta$ for all $t$.
This definition is not quite precise and we shall return to it later
in this section. However, consider first in what sense a sufficient
statistic $T$ contains all the information about $\theta$ contained in
$A$. For that purpose, suppose that an investigator reports the value
of $T$, but on being asked for the full data, admits that they have
been discarded. In an effort at reconstruction, one can use a random
mechanism (such as a pseudo-random number generator) to obtain a
random quantity $X'$ distributed according to the conditional
distribution of $X$ given $t$. (This would not be possible, of course,
if the conditional distribution depended on the unknown $\theta$.)
Then the unconditional distribution of $X'$ is the same as that of
$X$, that is, $$ P_\theta (X' \in A) = P_\theta (X \in
A)\quad\text{for all }A, $$ regardless of the value of $\theta$.
Hence, from a knowledge
of $T$ alone, it is possible to construct a quantity $X'$ which is
completely equivalent to the original $X$. Since $X$ and $X'$ have the
same distribution for all $\theta$, they provide exactly the same
information about $\theta$ (for example, the estimators $\delta(X)$
and $\delta(X')$ have identical distributions for any $\theta$).
The estimator $\delta(X')$ is thus random given the observed realisation $t$ of $T(X)$: unless $\delta(X')=\delta(X)$ with probability one, each new construction of $X'$ produces a different value of $\delta(X')$. In other words, for the observed realisation of the original data $X$, $\delta(X')$ is a random variable rather than a deterministic value, which explains the following quote, where the notion of randomised estimator is introduced.
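A minimal simulation can make this point concrete. Everything below is an illustrative assumption, not part of the quoted text: a Normal sample with known $\sigma$, the regeneration of $X'$ conditional on $\bar X$, and the (deliberately crude) estimator $\delta(X)=X_1$. For a fixed dataset, repeated calls produce different values of $\delta(X')$; unconditionally, $\delta(X')$ and $\delta(X)$ share the same distribution.

```python
import numpy as np

rng = np.random.default_rng(1)
n, mu, sigma = 10, 0.0, 1.0

def regenerate(xbar, n, sigma, rng):
    """Draw X' from the law of X given xbar (Normal sample, known sigma):
    centre an unconditional draw and shift it to the observed mean."""
    z = rng.normal(0.0, sigma, size=n)
    return xbar + z - z.mean()

x = rng.normal(mu, sigma, size=n)
xbar = x.mean()

# Given the observed data, delta(X') = X'_1 changes from call to call
draws = [regenerate(xbar, n, sigma, rng)[0] for _ in range(5)]
print(len(set(np.round(draws, 6))) > 1)  # True: randomised given x

# Yet unconditionally X'_1 and X_1 have the same N(mu, sigma^2) law
reps = 20_000
x1, x1p = np.empty(reps), np.empty(reps)
for i in range(reps):
    xs = rng.normal(mu, sigma, size=n)
    x1[i] = xs[0]
    x1p[i] = regenerate(xs.mean(), n, sigma, rng)[0]
print(abs(x1.std() - x1p.std()))  # small, up to Monte Carlo error
```

The regeneration step exploits the fact that, for an i.i.d. Normal sample, the residual vector $X-\bar X\mathbf 1$ is independent of $\bar X$, so shifting a fresh centred sample to the observed mean reproduces the conditional law exactly.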
The construction of $X'$ is, in general, effected with the help of an
independent random mechanism. An estimator $\delta(X')$ depends,
therefore, not only on $T$ but also on this mechanism. It is thus not
an estimator as defined in Section 1, but a randomized estimator.
Quite generally, if $X$ is the basic random observable, a randomized
estimator of $g(\theta)$ is a rule which assigns to each possible
outcome $x$ of $X$ a random variable $Y(x)$ with a known distribution.
When $X = x$, an observation of $Y(x)$ will be taken and will
constitute the estimate of $g(\theta)$. The risk, defined by (1.10),
of the resulting estimator is then $$ \int_{\mathcal X}\int_{\mathcal
Y} L(\theta, y)\,dP_Y(y\mid X=x)\,dP_X(x;\theta), $$ where the
probability measure in
the inside integral does not depend on $\theta$.
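This double integral can be approximated by a straightforward nested Monte Carlo: draw $x$ from $P_X(\cdot;\theta)$, then $y$ from $P_Y(\cdot\mid X=x)$, and average the losses. The concrete choices below (Normal location family, squared-error loss, the sample median as $\delta$, conditional regeneration given $\bar X$) are illustrative assumptions; since $X'$ has the same law as $X$, the randomised risk should match the risk of applying $\delta$ to $X$ directly.

```python
import numpy as np

rng = np.random.default_rng(2)
n, theta, sigma = 15, 2.0, 1.0
loss = lambda theta, y: (y - theta) ** 2  # squared-error loss

def inner_draw(xbar, n, sigma, rng):
    """One draw of Y(x): regenerate X' given xbar, then apply the median."""
    z = rng.normal(0.0, sigma, size=n)
    return np.median(xbar + z - z.mean())

outer, inner = 4_000, 20
risk_rand = 0.0
for _ in range(outer):                      # outer integral over P_X(.;theta)
    x = rng.normal(theta, sigma, size=n)
    ys = [inner_draw(x.mean(), n, sigma, rng) for _ in range(inner)]
    risk_rand += np.mean([loss(theta, y) for y in ys])
risk_rand /= outer

# Risk of the non-randomised delta(X) = median(X), for comparison
risk_det = np.mean([loss(theta, np.median(rng.normal(theta, sigma, size=n)))
                    for _ in range(outer * inner)])
print(risk_rand, risk_det)  # the two estimates agree up to MC error
```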
In the special case of the Normal distribution proposed in the question,
- the new sample $(X_1^\prime,\ldots,X_n^\prime)$ is generated from the conditional distribution of $X$ given $T(X)=\bar X$, and therefore satisfies $\bar{X^\prime}=\bar X$. In particular, if $\delta(X)=\bar X$, then $\delta(X^\prime)=\bar X$ as well;
- the proposed estimator $\delta(X^\prime)$, namely the sample mean, is then non-randomised, which definitely cancels the appeal of the example! If instead the median of the sample were used to estimate the mean $\mu$, the estimator $\delta(X^\prime)$ would be truly randomised, since the new Normal sample $X^\prime$ differs from $X$ and its median remains random conditional on $\bar X$. (With conditional expectation $\bar X$, though.)
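A short simulation of this Normal case confirms both bullet points (the sample size, seed, and known $\sigma$ are illustrative choices): the regenerated sample reproduces $\bar X$ exactly, so $\delta(X')=\bar X$ is degenerate, whereas the median of $X'$ keeps fluctuating conditional on $\bar X$, with conditional expectation $\bar X$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, mu, sigma = 20, 1.0, 2.0

# Original sample and its sufficient statistic (sigma assumed known)
x = rng.normal(mu, sigma, size=n)
xbar = x.mean()

def regenerate(xbar, n, sigma, rng):
    """Draw X' from the conditional distribution of X given xbar: for an
    i.i.d. Normal sample, centring a fresh draw and shifting it to xbar
    realises exactly that conditional law."""
    z = rng.normal(0.0, sigma, size=n)
    return xbar + z - z.mean()

x_prime = regenerate(xbar, n, sigma, rng)

# The sample mean is reproduced exactly: delta(X') = xbar, non-randomised
print(np.isclose(x_prime.mean(), xbar))  # True

# The median, by contrast, stays random given xbar ...
medians = [np.median(regenerate(xbar, n, sigma, rng))
           for _ in range(10_000)]
print(np.std(medians))   # clearly positive: a truly randomised estimator
# ... with conditional expectation xbar
print(np.mean(medians))  # close to xbar, up to Monte Carlo error
```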