
Suppose that $X_1, \ldots, X_n$ are iid data from a family of distributions with parameter $\theta \in \Theta$ and that $T(\boldsymbol{X})$ is a sufficient statistic. Now suppose that we are trying to determine if $T(\boldsymbol{X})$ is a complete statistic, that is, if:

$$ E_\theta\left[g(T)\right] = 0 $$

for all $\theta \in \Theta$ forces $g$ to be the zero function, ie, $P_\theta(g(T)=0) = 1$ for all $\theta$; if that is the case, $T$ is said to be complete for $\theta$.

If we want to show $T$ is not complete, we would just need to find a function $g$ with $g(T) \neq 0$ (with positive probability) that satisfies the above. My question is: when constructing this function, can it depend on both the data and the parameter? For example, can we theoretically define a function like $g(y) = (y\sum_{i=1}^n X_i - \theta)^2$, which depends on both $X$ and $\theta$?

user321627
  • No, $g$ cannot depend on $\theta$. – Xi'an Dec 09 '16 at 07:37
  • @Xi'an Is there a specific reason why $g$ can't depend on $\theta$, but can depend on the data? It seems that $g$ should be a deterministic function, but allowing it to depend on $X$ (a random variable) introduces a sense of stochasticity into the function. Thanks! – user321627 Dec 09 '16 at 07:40
  • Just think about it: if $g(t)=t-\mathbb{E}_\theta[T]$, you have $\mathbb{E}_\theta[g(T)]=0$ for every $\theta$. This turns completeness into a void notion. – Xi'an Dec 09 '16 at 08:08
  • Thanks, would you know if there is intuition behind why the function $g$ is allowed to depend on $X$, the data? Couldn't we construct trivial examples based on the data to obtain exact cancellations each time? – user321627 Dec 09 '16 at 08:38
  • Sorry, your question makes no sense: a function of $x$ has to depend on $x$. – Xi'an Dec 09 '16 at 09:14
  • I see, so you are saying that since $g$ is defined to be a function of the statistic, which in turn depends on the data, ie, $g(T(X))$, by default my function $g$ has to depend on my data $X$? – user321627 Dec 09 '16 at 09:24
  • In your original question, the $y$ in $g(y) = (y\sum_{i=1}^n X_i - \theta)^2$ makes no sense either. When considering $g(T)$, $g(\cdot)$ is the function and $T$ is the argument of the function, for instance $g(\sum_{i=1}^n X_i)$. – Xi'an Dec 09 '16 at 09:24
  • I see what you're saying, I guess what I was asking was if $T = \sum_{i=1}^nX_i$ is my statistic, would $g(T) = X_1$ be a valid functional form? – user321627 Dec 09 '16 at 09:41
  • $g(T)$ must be a function of $T$, not of something else; this is a basic fact about functions. – Xi'an Dec 09 '16 at 10:10

1 Answer


I believe the source of confusion here is the fact that, in statistics, the composition of functions is denoted with parentheses, i.e., $g(f)$ rather than $g \circ f$. Recall that we define $g \circ f$ as $(g \circ f)(x) := g(f(x))$ (given that $cod(f)=dom(g)$), which might be part of the reason why that notational choice is made. Why or how such a choice was made, I'm unable to answer (EDIT: check the end of the answer). However, one might see the expression $g(f)$ and assume that $f$ is an argument of $g$ (as in $f \in dom(g)$), which only causes confusion.

Before going into the question itself, let's review random variables and statistics. Given random variables $X_1,\dots,X_n$ (which are measurable functions $\Omega \to \mathbb{R}$), we have a corresponding random vector $X: \Omega \to \mathbb{R}^n$ given by $X(\omega) = (X_1(\omega),\dots,X_n(\omega))$. A statistic is then a function $T \circ X$, where $T: \mathbb{R}^n \to \mathbb{R}^m$ is measurable. As an example, consider $T(x_1,\dots,x_n) = (\sum_{i=1}^n x_i, \sum_{i=1}^n x_i^2)$; then $T \circ X$ is simply $(\sum_{i=1}^n X_i, \sum_{i=1}^n X_i^2)$. In the statistical literature, $T(X)$ is preferred over $T \circ X$.
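To make the composition concrete, here is a minimal Python sketch of the example above (the names `T` and `x` are my own illustrative choices, not anything standard):

```python
import numpy as np

# A toy sketch: the statistic is the composition T o X.
# T is an ordinary function on R^n; "T(X)" just means T applied
# to the realized random vector.
def T(x):
    # T: R^n -> R^2, T(x_1,...,x_n) = (sum of x_i, sum of x_i^2)
    return (np.sum(x), np.sum(x ** 2))

rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, scale=1.0, size=10)  # one realization of X(omega)
print(T(x))                                  # realization of (T o X)(omega)
```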

Regarding the question posed, what $g$ is and is not allowed to be in the definition of a complete statistic is almost never spelled out. In fact, $g$ is a measurable function $\mathbb{R}^m \to \mathbb{R}$ (its domain is the codomain of $T$), and once again $g(T)$ is shorthand for $g \circ T$. If $g$ were not measurable, then $g \circ T \circ X$ would not necessarily be measurable, which means its expected value might not even be defined.
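Continuing the toy sketch above, $g$ is just another plain function, applied to points in the codomain of $T$ (again, the particular rule below is an arbitrary illustration of mine):

```python
# g is a rule on R^2 (the codomain of T), and g(T) is shorthand
# for the composition g o T.
def g(t):
    s, s2 = t           # t = (sum, sum of squares), a point in R^2
    return s2 - s ** 2  # an arbitrary rule; note that no theta appears here

print(g(T(x)))  # a realization of g(T(X)) = (g o T o X)(omega)
```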

Seeing things this way might convince you that $g$ cannot depend on $\theta$. If you decide to insert an unknown $\theta$ into the rule of $g$, what you're really doing is either

  1. expanding $g$ from $dom(g)$ to $dom(g) \times \Theta$ or
  2. considering a family of functions $\{g_{\theta} \mid \theta \in \Theta \}$.

In either case (the two views are equivalent), you're no longer dealing with a single measurable function $\mathbb{R}^m \to \mathbb{R}$, which is what the definition requires.
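To see option 2 in action, here is a quick numerical sketch (my own example, assuming $X_i \sim \mathcal{N}(\theta, 1)$ and $T = \sum_{i=1}^n X_i$, so that $\mathbb{E}_\theta[T] = n\theta$). Each value of $\theta$ picks out a different member $g_\theta(t) = t - n\theta$ of the family, each with mean zero under its own $\theta$, yet no single member is the zero function:

```python
import numpy as np

# g_theta(t) = t - n*theta is a *family* of functions, one per theta.
# E_theta[g_theta(T)] = 0 for every theta, but g_theta(T) is not a.s. zero,
# so admitting theta-dependent rules would make completeness vacuous.
rng = np.random.default_rng(1)
n, reps = 10, 200_000
for theta in (0.0, 1.0, 2.5):
    X = rng.normal(loc=theta, scale=1.0, size=(reps, n))
    T = X.sum(axis=1)                # realizations of T = sum of X_i
    g_theta = T - n * theta          # a *different* function for each theta
    print(theta, g_theta.mean())     # approximately 0 in each case
```

Note that this does not contradict the completeness of $T$ itself: for the normal mean with known variance, $\sum_{i=1}^n X_i$ is in fact complete, precisely because no single $\theta$-free $g$ can pull off this cancellation.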

As a last note, it seems to me (a math student who migrated to statistics) that some related confusions stem from concepts that are not so carefully defined (for instance: can a statistic depend on a parameter?).

EDIT: It's actually quite straightforward why the notation $g(X)$ is used rather than $g \circ X$. If you'd rather not talk about probability spaces at all, then random variables are just "placeholder" symbols for their realizations. So the $X$ in $g(X)$ is analogous to what $x$ is in $f(x) = x+2$, for instance. From that point of view, your function $g$ (in the definition of complete statistics) has as its domain the set of all realizations of $T(X)$. Still, it cannot depend on an unknown $\theta$, for the same reason discussed above.