6

I am going through Harvard's Statistics 110 course.

In lecture 11 (https://youtu.be/TD1N4hxqMzY?t=4m38s), professor Blitzstein says that many students confuse random variables (RVs) with their distributions. As an analogy to help students separate these concepts, he says that RVs can be thought of as houses, and distributions as blueprints for houses.

Does this mean that we can view RVs as a concrete value from the distributions (for example, after having done an experiment, we now have instances and no probabilities are involved anymore), or am I misinterpreting his analogy?

samlaf
  • 477
  • 3
  • 10
  • See https://stats.stackexchange.com/questions/50 for answers to the question "what is a random variable?" It's hard to tell what you might mean by "concrete value from the distributions." – whuber Jul 17 '17 at 16:18
  • Not sure what I meant myself. I was just trying to understand his analogy of (RV = house, dist = blueprint) by mapping the formal definition of RV (function from outcome space to R or some other number system) to his analogy, and I got confused. – samlaf Jul 17 '17 at 16:26
  • Your comment in the link helped me: 'When the "events" become "known," what happens to the random variable? According to this answer, it can no longer exist!' So then a RV can't really be a "house", since there are no "random events" associated with a house, everything is determined (whether you used wood instead of stone, etc.). Right? – samlaf Jul 17 '17 at 16:29
  • 2
    The "random variable" is a fixed mathematical object. I have adopted a metaphor of Freeman *et al* in writing about it as a "consistent way of labeling tickets in a box." The *realization,* in this metaphor, is the physical process of drawing a ticket from that box (and then returning it, so that the contents of the box remain unchanged). That should make it clear that "realizing" a random variable does not change the random variable. It also makes a clear distinction between a realization and the distribution (which describes the frequencies of numbers on those tickets). – whuber Jul 17 '17 at 17:30

6 Answers

4

Yes, it is a value, but no, it doesn't necessarily have to be realized. A random variable can be realized or unrealized, just as a house can be built or unfinished. The analogy is meant to emphasize that a random variable can be thought of as the value, while a distribution is a function that describes the probability of those values. A random variable is not the thing doing the generating (blueprint, probability distribution); rather it is the thing being generated (house, random variable).

You can take this a step further. A random variable can be "looked at" in a few ways. All of these entities are separate things but "describe" the same phenomenon. Depending on the question you want to answer, you might use a random variable's

  1. value/label/representation, usually denoted by capital letters at the end of the alphabet. This is what he means when he talks about a random variable. This describes the outcome of one draw. Even though this convention is not always followed, usually it is capitalized if it has not been observed concretely, and it is written with a lower-case letter if it has.
  2. probability density/mass function. This is usually what is meant by a random variable's "distribution." A random variable will have one of these if it is discrete (pmf) or continuous (pdf). Sometimes it is denoted by $f_X(x; \theta)$ or $p_X(x;\theta)$, or something similar. They are useful for finding a random variable's expected value, variance, or other expectations. They can also be summed (discrete rvs) or integrated (continuous rvs) to give you probabilities of certain events or outcomes of the random variable.
  3. cumulative distribution function. This is a function that gives you probabilities that a random variable can be in a certain range.
  4. moment generating function. When it exists, it "completely defines a random variable," and it is good for finding the distribution of linear combinations of independent random variables. It is also another way to find a random variable's moments.
  5. characteristic function, similar to the mgf above.
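The list above can be sketched in code. This is a minimal illustration (the helper names `binom_pmf`, `binom_cdf`, and `realize` are mine, not from the answer) of how items 1-3 are distinct objects describing one phenomenon, here a Binomial(10, 0.5) random variable:

```python
import math
import random

def binom_pmf(k, n=10, p=0.5):
    """Item 2: the distribution -- a function from support values to probabilities."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k, n=10, p=0.5):
    """Item 3: cumulative probability P(X <= k), a sum over the pmf."""
    return sum(binom_pmf(j, n, p) for j in range(k + 1))

def realize(n=10, p=0.5, rng=random.Random(0)):
    """Item 1: a realization -- one concrete draw; it does not alter the pmf or cdf."""
    return sum(rng.random() < p for _ in range(n))

print(binom_pmf(5))    # P(X = 5) = 0.24609375
print(binom_cdf(5))    # P(X <= 5) = 0.623046875
print(realize())       # one concrete value in {0, ..., 10}
```

Note that calling `realize` repeatedly changes nothing about `binom_pmf` or `binom_cdf`, which mirrors whuber's point in the comments that realizing a random variable does not change the random variable.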
Taylor
  • 18,278
  • 2
  • 31
  • 66
  • 1
    This answer furthers the confusion. Random variables and distributions are quite distinct things, both mathematically and conceptually, yet you seem to equate them. – whuber Jul 17 '17 at 16:17
  • @whuber true. See new edits. – Taylor Jul 17 '17 at 16:41
  • But according to whuber, a random variable doesn't exist once it is realized (it is no longer random). So the way I see it now, a RV would be analogically equivalent to a prospective house (have decided to build a house at some location, but still not set on the exact specifications, which are given by the distribution). Once the house is completed however, we can no longer talk of it being a random variable. Is this correct? – samlaf Jul 17 '17 at 17:06
  • @samlaf yes, that's right. We could call that a realization of a random variable, though. The words aren't off limits, but it's not random anymore. The distinction being made by the analogy is not the random/nonrandom one, but rather the generator/generated one. – Taylor Jul 17 '17 at 17:20
  • 1
    Thank you very much for the added remarks in your post: in particular, they help me understand the original house/blueprint analogy (+1). – whuber Jul 17 '17 at 17:25
1

I'm going through the course, too. The Aha moment came with the distinction that a random variable is a function. Blitzstein isn't the only one who says this, but it was the first time I finally got it.

An r.v. is not an algebraic variable. In fact, it can even help to privately make up, for didactic purposes only, a new name for it instead of "variable." Just for one minute, you can beneficially lose any preconception you have about what a variable is in other contexts.

An r.v. maps one or more outcomes in the sample space to the real number line. It is therefore a function. The domain of an r.v. (a function) is the sample space, i.e. possible outcomes. The range of an r.v. (a function) is the support, namely the possible values of the r.v.

Sample space to real number support. That function is the r.v.

Support to probability. For a discrete r.v., that function is the Probability Mass Function. For a continuous r.v., any single value has probability zero, so one works with the Probability Density Function (or the Cumulative Distribution Function, which gives probabilities of ranges). The support (the real numbers the r.v. mapped to) was the range of the r.v., and it is now the domain of the PMF, PDF, or CDF.

Until you run an experiment, you have no outcomes. You have probabilities of outcomes. The probability distribution tells you what those are for the r.v.'s support. When you run an experiment, you have outcomes. The name for that is an event. An expression like the random variable $X = 7$ in a probability formula is not an expression of algebraic equality. It is an expression of an event. The experiment had one or more outcomes that r.v. $X$ mapped to the number 7.

I can see the inclination to say this "instantiated" the r.v. Maybe the analogy of a programmatic class being allocated to memory as an instantiated object is a helpful visualization. However, the most helpful visualization for me has been the distinction that an r.v. is a function.

I think what gets "instantiated" in an experiment is the outcome! The sample space expressed the potentiality. The experiment realizes outcomes from the sample space, yielding events, which are subsets of the sample space. Before the experiment you had a function that said how you would map an event to the number line. That's the r.v. You could describe the probabilities of those events using a PMF or CDF. Once you have an outcome, you don't have a "concrete r.v.," you have an event. The function is still an abstraction. The outcome is concrete. The mapping tells you the output of the r.v.

Interestingly, the mapped value is not to be mistaken as the outcome.

If my experiment is flipping two coins, the outcomes in the sample space are: HH, HT, TH, TT. If I define r.v. $X$ as the number of heads in the outcome, then the range of the r.v. (called its support) is {0, 1, 2}. If the outcome of my flip is TH, that's an event, namely a subset of the sample space. The r.v. maps that to 1. However, the event $X = 1$ encompasses 2 outcomes, TH and HT. The probability of this event is: $P(X = 1) = 0.5$. I picked that one on purpose to highlight that an outcome (like TH) is not necessarily a support value (like 1) and to highlight that the meaningful action of an r.v. is this mapping.
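The two-coin example above can be written directly as code, which makes the "r.v. is a function" point literal: $X$ is a function from outcomes to real numbers, and the event $X = 1$ is a subset of the sample space. A minimal sketch:

```python
# The sample space of two coin flips; each outcome is equally likely.
sample_space = ["HH", "HT", "TH", "TT"]

def X(outcome):
    """The random variable: a function mapping an outcome to its number of heads."""
    return outcome.count("H")

# The event {X = 1} is a subset of the sample space, not a single outcome.
event = [w for w in sample_space if X(w) == 1]
print(event)                           # ['HT', 'TH']
print(len(event) / len(sample_space))  # P(X = 1) = 0.5
```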

In summary, an r.v. is a function.

Jai Jeffryes
  • 123
  • 5
1

Yes, you can: this is both technically feasible and it can also aid intuition.


Intuition: Probabilistic intuition is best when it is built on an epistemic foundation that views probability as a belief based on available information. For this reason, it is generally a bad idea to try to build up intuition by thinking about whether a random variable is a concrete "realised" value or a random "unrealised" value. Instead, it is more useful to think about a random variable as always having a true value, but you may or may not know that value. The random variable has either been "observed" in which case its value is known, or it is "unobserved" in which case its value is not known.

Now let's step back and look at the "house blueprint" analogy for the probability distribution. If I show you a house blueprint then you will have a fair idea of what the house will look like, but there are a lot of little random aspects that you don't know (e.g., minor variations in craftsmanship, paint-job, etc.). Suppose that I build a large number of houses from that blueprint and then I show you one of those houses. The house I am showing you is now "observed" and so you can see the structure without having to rely on the blueprint. Moreover, you can see a lot of aspects of how the house is that were not clear from the blueprint. For example, you can see what colour the house is painted, you can see if there are any cracks or imperfections in the building, and where they are, etc. For these things that you have seen with your own eyes, the blueprint is no longer giving you any information about this house. Now think about one of the houses you have not seen. For that house you are still relying on the blueprint for what you think it looks like. You are not sure what colour I have painted it, you are not sure if or where there are imperfections, cracks, etc.

This is (imperfectly) analogous to a random variable and its distribution. Once you have observed the random variable, its probability distribution is no longer giving you any information on its value, because you can now see its value. Conversely, if you have not observed the random variable, your beliefs about it are based on its probability distribution. Now, this analogy is slightly imperfect, insofar as looking at a house does not show you every aspect of the house (there are still some things you can't see where you still rely on the blueprint). A slightly better probabilistic analogy here would be to consider a house as a random vector composed of a number of random variables, and you observe some of those random variables when you look at the house.

Notwithstanding this slight imperfection in the analogy, it still serves to aid intuition, and one can imagine a "perfected" version of the analogy where it is assumed that your inspection of the house is so thorough that you observe everything about it. The value of this analogy lies in the fact that it shows when the blueprint/distribution is giving you information about the house/random variable and when it is not.

Technical feasibility: Every univariate probability distribution corresponds to a probability measure $\mathbb{P}$ that maps subsets of the real numbers to a probability value between zero and one.$^\dagger$ From any distribution you can form a probability measure $\mathbb{P}_\infty$ corresponding to a sequence of independent and identically distributed random variables with that distribution. This means that if you have an initial distribution for a scalar random variable, it is always possible to define a sequence of IID random variables with that distribution. Technically speaking, if you start with any distribution $D$ then you can map this to a sequence $\mathbf{x} = (x_1,x_2,x_3,...) \sim \text{IID } D$.

This technical result ensures that we are on solid ground when we transition from thinking about a distribution to thinking about a sequence of "instantiations" of that distribution. We know that we will never encounter a situation where there is a technical impediment to transitioning from the distribution to an infinite number of "instantiations".
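The "distribution to IID instantiations" step is easy to demonstrate computationally. This is a minimal sketch (the function name and seed are my choices, not the answer's) that draws an IID sequence from one fixed Bernoulli distribution:

```python
import random

rng = random.Random(42)

def draw_iid(n, p=0.3):
    """n independent Bernoulli(p) draws: many 'instantiations' of one distribution."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

xs = draw_iid(10_000)
print(sum(xs) / len(xs))  # the relative frequency settles near p = 0.3
```

Each element of `xs` is one instantiation; the distribution itself (the pair of probabilities $p$ and $1-p$) is unchanged no matter how many draws are taken.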


$^\dagger$ For technical reasons that are beyond the scope of this post, the domain of the probability measure does not include all subsets of the real numbers. Instead, the domain of the probability measure is the class of Borel sets, which includes sets that are made up from countable unions, intersections, and complements of some initial real intervals.

Ben
  • 91,027
  • 3
  • 150
  • 376
1

No. In a loose sense, a random sample can be seen as some sort of instantiation of the random variable. However, the RV itself is not an instance of its distribution in any meaningful sense. The distribution function doesn't have instances.

Aksakal
  • 55,939
  • 5
  • 90
  • 176
0

One intuitive distribution is the Bernoulli distribution. It describes the outcome of throwing a coin, which lands heads with probability $p$ and tails with probability $q=1-p$.

  • If you throw the coin once, you will observe either heads or tails. However, this outcome is the random variable; it is not the distribution. The distribution, however, defines with which probability you observe heads and tails. The same is true for all distributions -- continuous and discrete.

Blitzstein's analogy goes a bit further, because there is not a single Bernoulli distribution but a family of Bernoulli distributions: for each value of $p$ you get a different Bernoulli distribution.
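The family idea can be made concrete with a tiny sketch (the helper name is mine): each value of $p$ picks out one member of the Bernoulli family.

```python
def bernoulli_pmf(k, p):
    """P(X = k) for a Bernoulli(p) random variable, with k in {0, 1}."""
    return p if k == 1 else 1 - p

# Three different distributions from the same one-parameter family:
for p in (0.2, 0.5, 0.8):
    print(p, bernoulli_pmf(1, p), bernoulli_pmf(0, p))
```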

Semoi
  • 574
  • 1
  • 4
  • 16
  • 1
    I don't think this is what he is referring to. A distribution is an instance (fixed parameters) of a family of distributions, but he is really talking about the difference between a RV and a given distribution (not a family of distributions). – samlaf Jul 17 '17 at 17:02
  • Well, this is how I understood Blitzstein -- as far as I remember. Furthermore, the rest was covered by whuber in the comments. – Semoi Jul 17 '17 at 18:48
0

In lecture 11, Professor Blitzstein introduces the blueprint:house analogy by saying "Word is not the thing. Map is not the territory".

In lecture 8 he discusses it in the context of a specific example: random variables $X_1, X_2, \ldots, X_n$, which are i.i.d. Bernoulli($p$). Note that his point is not about the different values that $p$ can take and the fact that the different values of $p$ define a family of Bernoulli distributions, each with a different value for the parameter $p$. In his mind he is working with a particular value of $p$, so let's let $p$ stand for the probability of the event that the coin lands Heads, and fix $p=0.6$ to make things clearer.

What he is saying is that once the experiment is realized, random variable $X_1$ will crystallize into value 1 if on the first flip the coin lands Heads and into value 0 if it lands Tails; random variable $X_2$ will crystallize into value 1 if on the second flip the coin lands Heads and into value 0 if it lands Tails, etc. The probability of the coin landing Heads is the same on each flip, 0.6, and is dictated by the distribution (PMF). Hence the analogy: the blueprint (i.e., the PMF) is used to construct multiple houses $X_1, X_2, \ldots, X_n$.

Let's make things even more concrete. Since we are working with Bernoulli random variables here, we can think of the PMF or the blueprint as saying that there is a probability $p$ of a big house being built and a probability $1-p$ of a small house being built; which one it will be will be determined by the outcome of a coin flip: Heads=Big House, Tails=Small House. $X_1$ is a house; big or small, we don't know, but it is determined by the outcome of flipping the coin for the first time, and since there is a 0.6 probability of the coin landing Heads, that is the probability of the first house being a big house. Similarly, $X_2$ is a house; big or small, we don't know, but it is determined by the outcome of flipping the coin the second time, and since there is a 0.6 probability of the coin landing Heads, that is the probability of the second house being a big house. Same with all the other random variables $X_3, \ldots, X_n$.
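The big-house/small-house story above simulates naturally. This is a small sketch (the names are mine, not Blitzstein's): one blueprint, the Bernoulli(0.6) PMF, is used to build many houses $X_1, \ldots, X_n$.

```python
import random

rng = random.Random(0)
p = 0.6  # probability the coin lands Heads, i.e. that a Big house is built

def build_house():
    """One coin flip realizes one r.v.: Heads = Big house, Tails = Small house."""
    return "Big" if rng.random() < p else "Small"

houses = [build_house() for _ in range(10_000)]  # X_1, ..., X_n, crystallized
print(houses[:5])        # the first few realizations
big_share = sum(h == "Big" for h in houses) / len(houses)
print(big_share)         # close to p = 0.6 in the long run
```

Each element of `houses` corresponds to one crystallized random variable; the blueprint (`p` and the rule inside `build_house`) never changes.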

Let's revisit your question: Does this mean that we can view RVs as a concrete value from the distributions (for example, after having done an experiment, we now have instances and no probabilities are involved anymore), or am I misinterpreting his analogy?

No. Not a concrete value. A misinterpretation indeed but a very good question to ask.

ColorStatistics
  • 2,699
  • 1
  • 10
  • 26