Am I right in thinking that $\bar\mu_n$ is the average of $n$ different population means?

Here it is used in the context that confused me. It's the Chebychev WLLN, apparently.

"If $x_i$, $i = 1, \ldots, n$ is a sample of $n$ observations such that $E[x_i] = \mu_i < \infty$ and $\operatorname{Var}[x_i] = \sigma_i^2$ such that $\bar\sigma_n^2/n = (1/n^2)\sum_i \sigma_i^2 \rightarrow 0$ as $n \rightarrow \infty$ then $\operatorname{plim}(\bar x_n - \bar\mu_n) = 0$."

Is this saying that each observation $x_i$ corresponds to its own population $i$ (from above, $E[x_i] = \mu_i$), and that as the sample gets bigger we have to average over populations?

So if I were to draw the random variable 1 from a population of {1,2,3} and the random variable 4 from the population {4,5,6} then the population of my sample 1,4 is {1,2,3,4,5,6}?

amoeba
EconStats

1 Answer

You are right.

"If $x_i$, $i = 1, \ldots, n$ is a sample of $n$ observations such that $E[x_i] = \mu_i < \infty$ and $\operatorname{Var}[x_i] = \sigma_i^2$ such that $\bar\sigma_n^2/n = (1/n^2)\sum_i \sigma_i^2 \rightarrow 0$ as $n \rightarrow \infty$," then, for every $\epsilon > 0$,

$$\lim_{n\rightarrow \infty}P\left(\left|\frac 1n\sum_{i=1}^nX_i - \frac 1n\sum_{i=1}^nE(X_i)\right|<\epsilon\right) =1$$

I guess you can make the notational mapping.
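
A sketch of that mapping, together with the Chebyshev-inequality step that usually drives this kind of result. It assumes the $X_i$ are uncorrelated, which the quoted statement does not spell out; under that assumption the variance of the sample mean equals $(1/n^2)\sum_i \sigma_i^2$.

$$\bar x_n = \frac 1n\sum_{i=1}^n X_i, \qquad \bar\mu_n = \frac 1n\sum_{i=1}^n E(X_i) = \frac 1n\sum_{i=1}^n \mu_i$$

$$P\left(\left|\bar x_n - \bar\mu_n\right| \geq \epsilon\right) \leq \frac{\operatorname{Var}(\bar x_n)}{\epsilon^2} = \frac{1}{\epsilon^2}\cdot\frac{1}{n^2}\sum_{i=1}^n \sigma_i^2 \rightarrow 0 \quad \text{as } n \rightarrow \infty$$

The probability of the complementary event $\left|\bar x_n - \bar\mu_n\right| < \epsilon$ therefore tends to 1, which is the same statement as $\operatorname{plim}(\bar x_n - \bar\mu_n) = 0$.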

Since by design we assume different moments for each $X_i$, each comes from a different population. So if by $\{1,2,3\}$ you mean values of the index $i$, then $\{1,2,3\}$ is not a population, but a set including three values of the index with each value representing a different population.

If you consider the random variables $\{X_1,X_4\}$, it is a pair coming from two different populations. You do not "unite" the two populations "into one": being different with respect to the object of study (convergence of sample moments), how could they form a single population (for the purposes of the specific study)? Have you contemplated how the abstract concept of "statistical population" is defined?
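
For a concrete feel, here is a minimal simulation sketch in Python. Everything in it is a hypothetical illustration rather than part of the theorem: the $i$-th population is taken to be the three values $\{3i-2, 3i-1, 3i\}$ (so the first two are $\{1,2,3\}$ and $\{4,5,6\}$, as in the question), each $X_i$ is a single independent uniform draw from its own population, and the names `population` and `mean_gap` are made up for this example.

```python
import random


def population(i):
    """Hypothetical i-th population: the three values {3i-2, 3i-1, 3i}."""
    return (3 * i - 2, 3 * i - 1, 3 * i)


def mean_gap(n, seed=0):
    """Return x_bar_n - mu_bar_n for one draw from each of n populations."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    gap = 0.0
    for i in range(1, n + 1):
        values = population(i)
        mu_i = sum(values) / len(values)  # population mean, equals 3i - 1
        x_i = rng.choice(values)          # one uniform draw from population i
        gap += x_i - mu_i                 # accumulate the deviation X_i - mu_i
    return gap / n                        # (1/n) * sum of deviations


if __name__ == "__main__":
    for n in (10, 1_000, 100_000):
        print(n, mean_gap(n))
```

Each $X_i$ keeps its own mean $\mu_i = 3i - 1$ and the populations are never pooled. Both $\bar x_n$ and $\bar\mu_n$ grow with $n$ here; only their difference shrinks, because every $\sigma_i^2 = 2/3$ and so $(1/n^2)\sum_i \sigma_i^2 = 2/(3n) \rightarrow 0$.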

Alecos Papadopoulos
  • No, I didn't intend for {1,2,3} to be the index values of $i$; I meant for {1,2,3} to be the population that the sample $x_1$ is drawn from. For example, the sample $x_1$ has one element {1}, but $\mu_1 = 2$, i.e. $(1+2+3)/3$. – EconStats Jul 26 '14 at 21:55
  • You mean the _values_ that the random variable can take? This is never called a "population". It is the set from which the members of the population take their values, usually called the "support". In the example you state in the question, the _joint_ support is the Cartesian product $\{1,2,3\} \times \{4,5,6\}$. You have a two-dimensional vector here, so the support should also be two-dimensional. – Alecos Papadopoulos Jul 26 '14 at 21:59
  • Continuing with the index notation for a moment: in this example, do you think that $x_1$ and $x_n$ are draws from two different populations, but that the number of observations in sample $n$ is greater than the number of observations in sample 1? If this is the case (and I'm not saying it is, because I don't fully understand it), then we don't need to average across populations, because $\operatorname{plim}(\bar x_n - \mu_n) = 0$. I guess I'm wondering what is happening when the index goes to $n$. Are we drawing larger numbers from the same population or from different populations? – EconStats Jul 26 '14 at 22:05
  • With respect to the most recent comment, I had actually never heard the word "support" used like that (though I had often used the phrase "common support" when doing PSM). So all the values that a population mean is calculated from are called the "support"? – EconStats Jul 26 '14 at 22:15
  • As for the terminology, the "support" of the distribution of a random variable is standard, at least in mathematical statistics, for the _range_ of the random variable, which in turn is the _domain_ of the probability functions (and from where we "draw" realizations of random variables). – Alecos Papadopoulos Jul 26 '14 at 23:01
  • First, please see part B of my answer to this question http://stats.stackexchange.com/questions/107912/what-is-the-difference-between-sample-and-outcome-plus-events-and-observations/107936#107936 regarding the use of the terms "sample" and "observation". $x_n$ (preferably $X_n$) does not denote the sum or the collection of all $X$'s up to $n$. It denotes only the $n$-th random variable (upper case) or its realization (lower case). The notation $\bar x_n$ does not imply that, by removing the bar, $x_n$ becomes the sum of all $X$'s up to that point (it is bad notation, anyway). – Alecos Papadopoulos Jul 26 '14 at 23:07
  • Let us [continue this discussion in chat](http://chat.stackexchange.com/rooms/15982/discussion-between-econstats-and-alecos-papadopoulos). – EconStats Jul 26 '14 at 23:21
  • I was away, just saw your invitation. I am available now, if you are still at it. – Alecos Papadopoulos Jul 27 '14 at 00:23
  • Don't worry about responding, I think I have a million questions on this topic! I get the most basic version of the LLN, but when we move out of the arena of iid sampling into heterogeneous distributions and different populations, I really need to see an example for it to sink in. Kind of like doing a math problem: you need to complete one unassisted to truly understand the method. – EconStats Jul 27 '14 at 00:49
  • That's true, indeed. – Alecos Papadopoulos Jul 27 '14 at 01:00