Problem
Can someone help explain the concept of inference as presented by Bradley Efron and Trevor Hastie in their book Computer Age Statistical Inference? In Chapter 2, beginning on page 13, they introduce the estimation of a property $\theta$ of an unknown distribution $F$. Their explanation leaves me scratching my head. They write:
The estimate $\hat{\theta}$ is calculated from $\mathbf{x}$ according to some known algorithm, say $\hat{\theta} = t(\mathbf{x})$, with $t(\mathbf{x})$ in our example being the averaging function $\bar{x} = \sum x_i / n$; $\hat{\theta}$ is a realization of $\hat{\Theta} = t(\mathbf{X})$, the output of $t(\cdot)$ applied to a theoretical sample $\mathbf{X}$ from $F$.
They go on to say that the whole point of the passage I just quoted is to give a definition of inference: the accuracy of an observed estimate $\hat{\theta} = t(\mathbf{x})$ is the probabilistic accuracy of $\hat{\Theta} = t(\mathbf{X})$ as an estimator of $\theta$.
As examples, they provide the following:
- $\mu = E_F\{\hat{\Theta}\}$ as the expectation
- bias as $\text{bias} = \mu - \theta$
- variance as $\text{var} = E_F\{(\hat{\Theta} - \mu)^2\}$.
Huh? What are they trying to communicate here? Let me go back to the basics. I get that $\mathbf{x}$ is the observed data, $\mathbf{x} = (x_1, x_2, \ldots, x_n)$. I understand that they define $\mathbf{X} = (X_1, X_2, \ldots, X_n)$ to indicate $n$ independent draws from a probability distribution $F$. However, I am completely confused about what $\hat{\Theta}$ and $\mathbf{X}$ are doing here to support the definition of inference. Inference of what?
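To check my own reading, here is a minimal simulation sketch of the quantities above. It rests on assumptions of my own, not the book's: $F$ is a normal distribution with mean $\theta = 5$ and standard deviation $2$, $n = 25$, and $t(\cdot)$ is the average. The names `t`, `draw_sample`, and `simulate` are made up for illustration. The sketch computes one $\hat{\theta}$ from one data set and then approximates $\mu$, the bias, and the variance by repeating the draw many times.

```python
import random
import statistics


def t(sample):
    """The estimator t(.): here, the sample average."""
    return sum(sample) / len(sample)


def draw_sample(rng, n, theta, sigma):
    """One sample of n independent draws from F, assumed here to be N(theta, sigma^2)."""
    return [rng.gauss(theta, sigma) for _ in range(n)]


def simulate(theta=5.0, sigma=2.0, n=25, replications=20000, seed=0):
    """Compare one observed estimate with the simulated behaviour of t(X)."""
    rng = random.Random(seed)

    # One observed data set x gives one number: theta_hat = t(x).
    x = draw_sample(rng, n, theta, sigma)
    theta_hat = t(x)

    # Many hypothetical data sets X give many realizations of Theta_hat = t(X).
    realizations = [
        t(draw_sample(rng, n, theta, sigma)) for _ in range(replications)
    ]

    # Monte Carlo approximations of the three quantities in the list above.
    mu = statistics.fmean(realizations)           # approximates E_F{Theta_hat}
    bias = mu - theta                             # mu - theta
    var = statistics.pvariance(realizations, mu)  # approximates E_F{(Theta_hat - mu)^2}

    return {
        "theta_hat": theta_hat,
        "mu": mu,
        "bias": bias,
        "var": var,
        "theory_var": sigma ** 2 / n,  # known variance of an average of n i.i.d. draws
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.4f}")
```

If my reading is right, `theta_hat` is a single number computed from one data set, while `mu`, `bias`, and `var` describe how $t(\mathbf{X})$ would behave over repeated samples from $F$. For the average of independent normal draws, `var` should come out close to $\sigma^2 / n$.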
Questions
I have the following questions from the above reading:
- Is it possible to explain this definition using a real life example/distribution, such as the normal distribution?
- What is the conceptual difference between $\hat{\theta}$ and $\hat{\Theta}$?
- What does $t(\cdot)$ mean?
I think these questions capture my confusion with the definition. I am self-studying, and I find the authors' high-level overview beautiful, but a more applied and less abstract explanation would help me here. I am hoping someone well versed in statistics can explain this to me.