Can someone explain to me the frequentist definition of inference by Efron and Hastie using a real example?

Question

Problem

Can someone help explain the concept of inference as presented by Bradley Efron and Trevor Hastie in their book Computer Age Statistical Inference? In Chapter 2, beginning on page 13, they introduce estimation of a property $\theta$ of an unknown distribution $F$. Their explanation leaves me scratching my head. They write:

The estimate $\hat{\theta}$ is calculated from $\mathbf x$ according to some known algorithm, say $\hat{\theta}=t(\mathbf x)$, with $t(\mathbf x)$ in our example being the averaging function $\bar x = \sum x_i / n$; $\hat{\theta}$ is a realization of $\hat{\Theta} = t(\mathbf X)$, the output of $t(\cdot)$ applied to a theoretical sample $\mathbf X$ from $F$.

They later say that the whole point of the paragraph I just shared is to define frequentist inference. That is, the accuracy of an observed estimate $\hat{\theta} = t(\mathbf x)$ is the probabilistic accuracy of $\hat{\Theta} = t(\mathbf X)$ as an estimator of $\theta$.

As examples, they provide the following:

$\mu = E_F\{\hat{\Theta}\}$ as the expectation
$\mathrm{bias} = \mu - \theta$ as the bias
$\mathrm{var} = E_F\{(\hat{\Theta} - \mu)^2\}$ as the variance

Huh? What are they trying to communicate here? Let me go back to the basics. I get that $\mathbf x = (x_1, x_2, \ldots, x_n)$ is the observed data. I understand that they define $\mathbf X = (X_1, X_2, \ldots, X_n)$ to indicate $n$ independent draws from a probability distribution $F$. However, I am completely confused about what $\hat{\Theta}$ and $\mathbf X$ are doing here to help define inference. Inference of what?

Questions

I have the following questions from the above reading:

Is it possible to explain this definition using a real life example/distribution, such as the normal distribution?
What is the conceptual difference between $\hat{\theta}$ and $\hat{\Theta}$?
What does $t(\cdot)$ mean?

I think the above captures my confusion with the definition. I am self-studying, and I think the high-level overview given by the authors is beautiful, but a more applied, less abstract explanation would serve me well here. These are my key questions, and I am hoping a kind person well versed in statistics can explain this to me.

Please review [What is the difference between estimation and prediction?](http://stats.stackexchange.com/questions/17773). For a simple real-life example, worked and explained in great detail, consider my post at http://stats.stackexchange.com/questions/18603/how-can-i-calculate-margin-of-error-in-a-nps-net-promoter-score-result/18609#18609. — whuber, Dec 08 '16 at 18:44

score 3 · Accepted Answer · answered Dec 08 '16 at 18:51

Answering your questions in reverse:

$t(\cdot)$ is just a function which happens to turn observations into an estimate
$\hat \Theta$ is an estimator of $\theta$. Since it is a function of a random variable $\mathbf X$, it too is a random variable and has a distribution. Meanwhile $\hat \theta$ is an estimate of $\theta$ made after having observed $\mathbf x$ and so has a particular (possibly vector) value
Suppose $X$ is normally distributed with mean and variance parameters $\theta=(m,v)$. Given $n$ observations $\mathbf{X}=(X_1,X_2,\ldots, X_n)$ you might think that a sensible estimator could be $$\hat \Theta = t(\mathbf{X})=\left(\tfrac1n \sum_i X_i, \tfrac1n \sum_j \left(X_j-\tfrac1n \sum_i X_i\right)^2 \right)$$ and it turns out that this gives you an unbiased estimate of the mean but a biased estimate of the variance, because the expectation is $E[\hat \Theta]=(m, \frac{n-1}{n}v)$, making the bias $(0,-\frac{1}{n}v)$.
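
To make the distinction concrete, here is a minimal simulation sketch of the normal example. It is only an illustration: the values `m=2`, `v=9`, `n=5`, the number of repetitions and the helper name `simulate` are assumptions chosen for the demo, not anything from the book. One call to `t` on a single sample gives an estimate $\hat\theta$ (fixed numbers). Calling `t` on many fresh samples from $F$ gives many realizations of the estimator $\hat\Theta$, whose average, bias and variance approximate $\mu$, $\mathrm{bias}$ and $\mathrm{var}$. For the variance estimator with divisor $n$, the simulated bias should come out close to the theoretical $-v/n$.

```python
import random
from statistics import fmean, pvariance


def t(x):
    """The estimator from the example: (sample mean, variance with divisor n)."""
    return fmean(x), pvariance(x)


def simulate(m=2.0, v=9.0, n=5, reps=20_000, seed=0):
    """Approximate mu, bias and var of Theta_hat = t(X) by repeated sampling.

    All default values are illustrative assumptions, not taken from the book.
    """
    rng = random.Random(seed)
    sd = v ** 0.5

    def draw():
        # One sample of n independent draws from F = Normal(m, v).
        return [rng.gauss(m, sd) for _ in range(n)]

    # theta_hat: the estimate from ONE observed sample x (just two numbers).
    theta_hat = t(draw())

    # Theta_hat: the estimator viewed as a random variable, approximated
    # here by its realizations over many theoretical samples X from F.
    realizations = [t(draw()) for _ in range(reps)]
    means = [r[0] for r in realizations]
    variances = [r[1] for r in realizations]

    mu = (fmean(means), fmean(variances))          # approximates E_F[Theta_hat]
    bias = (mu[0] - m, mu[1] - v)                  # mu - theta
    var = (pvariance(means), pvariance(variances))  # spread of Theta_hat around mu
    return theta_hat, mu, bias, var


if __name__ == "__main__":
    theta_hat, mu, bias, var = simulate()
    print("one estimate theta_hat:", theta_hat)
    print("simulated mu:          ", mu)
    print("simulated bias:        ", bias, "(theory: 0 and -v/n = -1.8)")
    print("simulated var:         ", var)
```

In practice only the single sample $\mathbf x$ is available, so $\mu$, the bias and the variance have to be derived from theory or estimated. The simulation can compute them directly only because it assumes $F$ is known.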

I like Henry's answer. Efron's notation can be a little confusing. It all makes sense when you consider Henry's example. — Michael R. Chernick, Dec 08 '16 at 20:25
So in other words, $\hat\theta$ is what we calculate from our data and $\hat\Theta$ is what we derive from our assumptions about the underlying distribution $F$ (which in reality can never be truly observed). One we calculate from data and the other from theory, to estimate properties of the numbers we might really see in experiments. That is "inference", right? — hlyates, Dec 09 '16 at 17:55
