28

In my probability class the term "sums of random variables" is constantly used. However, I'm stuck on what exactly that means.

Are we talking about the sum of a bunch of realizations from a random variable? If so, doesn't that add up to a single number? How does a sum of random variable realizations lead us to a distribution, or a cdf / pdf / function of any kind? And if it isn't random variable realizations, then what exactly is being added?

gung - Reinstate Monica
Gosset
  • 1
    By 'realizations of a random variable' I assume you mean the actual observed values. What is being summed in the 'sum of random variables' is the random variables before they are observed. Imagine calculating the total weight of the next 5 people to get on the elevator. You don't know their weights (yet), and so each weight is a random variable. But you would probably like to know something about the distribution of the sum of their weights. – PeterR May 01 '14 at 19:33
  • @PeterR This is what I don't understand. How does it even make sense to talk about adding something that doesn't have a value yet? Is it a metaphorical type of summing? – Gosset May 01 '14 at 19:47
  • 1
    I think your problem is that you don't understand what a random variable is. If you get this concept then the sum will come easily too. – Aksakal May 01 '14 at 19:49
  • 1
    @Aksakal Isn't the fact that I posted this question evidence of that already? Perhaps if you do know it, you could clarify the concept? – Gosset May 01 '14 at 19:56
  • Great answers have been given. Another good example is the sum of two dice, $X+Y$. The result is clearly random (you don't know in advance what the sum of the two dice will be). We know that $X$ and $Y$ are independent and each is uniformly distributed on $\{1,\dots,6\}$. It turns out that $X+Y$ has a (discrete) triangular distribution. – bdeonovic May 02 '14 at 12:00
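
The two-dice example in the last comment can be checked by brute force. The following is only a sketch, assuming two fair, independent six-sided dice; the names `faces`, `counts`, and `pmf` are illustrative and do not come from the thread.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)  # assumption: fair six-sided dice

# With independent fair dice, each of the 36 ordered pairs (x, y) is equally likely.
counts = Counter(x + y for x, y in product(faces, repeat=2))

# Probability of each possible sum, as an exact fraction.
pmf = {s: Fraction(c, 36) for s, c in sorted(counts.items())}

for s, p in pmf.items():
    print(s, p)
```

The probabilities rise linearly from the sum 2 up to the sum 7 and then fall linearly down to 12, which is the triangular shape mentioned in the comment.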

4 Answers

46

A physical, intuitive model of a random variable is to write down the name of every member of a population on one or more slips of paper--"tickets"--and put those tickets into a box. The process of thoroughly mixing the contents of the box, followed by blindly pulling out one ticket--exactly as in a lottery--models randomness. Non-uniform probabilities are modeled by introducing variable numbers of tickets in the box: more tickets for the more probable members, fewer for the less probable.

A random variable is a number associated with each member of the population. (Therefore, for consistency, every ticket for a given member has to have the same number written on it.) Multiple random variables are modeled by reserving spaces on the tickets for more than one number. We usually give those spaces names like $X,$ $Y,$ and $Z$. The sum of those random variables is the usual sum: reserve a new space on every ticket for the sum, read off the values of $X,$ $Y,$ etc. on each ticket, and write their sum in that new space. This is a consistent way of writing numbers on the tickets, so it's another random variable.

Figure

This figure portrays a box representing a population $\Omega=\{\alpha,\beta,\gamma\}$ and three random variables $X$, $Y$, and $X+Y$. It contains six tickets: the three for $\alpha$ (blue) give it a probability of $3/6$, the two for $\beta$ (yellow) give it a probability of $2/6$, and the one for $\gamma$ (green) gives it a probability of $1/6$. In order to display what is written on the tickets, they are shown before being mixed.

The beauty of this approach is that all the paradoxical parts of the question turn out to be correct:

  • the sum of random variables is indeed a single, definite number (for each member of the population),

  • yet it also leads to a distribution (given by the frequencies with which the sum appears in the box), and

  • it still effectively models a random process (because the tickets are still blindly drawn from the box).

In this fashion the sum can simultaneously have a definite value (given by the rules of addition as applied to numbers on each of the tickets) while the realization--which will be a ticket drawn from the box--does not have a value until it is carried out.
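
The tickets-in-a-box model can be sketched in a few lines of Python. The ticket counts (three, two, one) follow the figure's description, but the numbers written for $X$ and $Y$ are made up for illustration, and the names `values`, `box`, `tickets`, and `distribution` are hypothetical.

```python
import random
from collections import Counter
from fractions import Fraction

# Assumed (X, Y) values for each member of the population; not taken from the figure.
values = {"alpha": (1, 2), "beta": (2, 5), "gamma": (4, 0)}

# Six tickets: three for alpha, two for beta, one for gamma.
box = ["alpha"] * 3 + ["beta"] * 2 + ["gamma"]

# Write X, Y, and their sum in separate spaces on every ticket.
tickets = [
    {"member": m, "X": values[m][0], "Y": values[m][1], "X+Y": sum(values[m])}
    for m in box
]

def distribution(name):
    """Relative frequency of each value written in the space `name`."""
    counts = Counter(t[name] for t in tickets)
    return {v: Fraction(c, len(tickets)) for v, c in sorted(counts.items())}

print(distribution("X+Y"))    # the distribution of the sum
draw = random.choice(tickets)  # a realization: one ticket drawn blindly
print(draw["X+Y"])
```

Every ticket carries a definite value of $X+Y$ before anything is drawn; the distribution comes from counting tickets, and the randomness comes only from the draw.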

This physical model of drawing tickets from a box is adopted in the theoretical literature and made rigorous with the definitions of sample space (the population), sigma algebras (with their associated probability measures), and random variables as measurable functions defined on the sample space.

This account of random variables is elaborated, with realistic examples, at "What is meant by a random variable?".

whuber
  • 4
    +1 exemplary post. I hope you don't mind the impertinent question, but what was the illustration done in? – Glen_b May 01 '14 at 23:01
  • 4
    @Glen_b PowerPoint :-). The image of a box is from http://mymiddlec.files.wordpress.com/2013/09/empty-box.jpg. The tickets are PowerPoint graphics. (There's nothing impertinent about such questions!) I grouped the whole bunch, pasted it into Paint, and used that to save it as a .png file. – whuber May 02 '14 at 01:48
  • I'm missing something, but it seems like you are just writing multiple numerical labels on each member of the population. All alphas have X=1, Y=2 and hence X+Y=3. X, Y and X+Y have exactly the same distribution, shifted by a value here and a value there, because of the different labels. – MiloMinderbinder Sep 30 '19 at 01:01
  • @Milo That's incorrect. Try some examples. For instance, consider a box with three $(X,Y)$ tickets with the values $(0,3),(1,2),(3,0):$ all the distributions of $X,$ $Y,$ and $X+Y$ are different. – whuber Sep 30 '19 at 14:12
  • @whuber - I said "X, Y and X+ Y have exactly same distribution, shifted by a value here a value there" ... I meant that the shape is the same for all the random variables, whatever labels you provide. For example, in your answer's example all the relative frequencies (for X, Y, X+Y) are 1:2:3; just the labels attached to them have changed. Essentially, three bars of length l, 2l, 3l make up the distribution of each of X, Y, and X+Y; where they are placed depends upon the labels you provided. – MiloMinderbinder Sep 30 '19 at 15:57
  • @Milo The "labels attached to them" are called *random variables.* They are fundamental to all probabilistic analysis of data. What you mean by "same shape" is unclear--it's impossible to have "relative frequencies [of] 1:2:3" with just three objects--but it appears you might be referring to the underlying probability measure on the sample space. One of the beauties of this (standard) mathematical setup is how it cleanly separates the probability measure from the random variables. – whuber Sep 30 '19 at 16:02
  • 1
    @whuber - I should have written frequencies. I'm not well versed enough in mathematical jargon to say 'underlying probability measure'. Anyhow, you are getting my drift. I am beginning to see how I can play around with the numbers on the tickets to get the desired probability distribution. At a cursory level this approach just seemed like wordplay with different 'labels', so I was not seeing it clearly. This would be like the 50th time you have helped me on this site. Thank you – MiloMinderbinder Sep 30 '19 at 16:11
  • 1
    @Milo You're welcome. I see now that you were reacting to the example in this answer rather than the example I gave in the preceding comments. The answer's example indeed does have three different tickets with relative frequencies 1:2:3, and that is all that "probability measure" means in this case. This isn't *just* jargon, though: there's a profound need for the underlying concepts. See, *inter alia,* https://stats.stackexchange.com/questions/199280 for some nice accounts. – whuber Sep 30 '19 at 16:17
  • @whuber To model an example Bernoulli r.v. with p=1/2, I'd have two tickets, with 1 on one and 0 on the other. How is that extended to the sum of two such Bernoulli r.v.s? – MiloMinderbinder Oct 08 '19 at 02:18
  • @Milo Create a box with four tickets corresponding to all four possible realizations of tickets from two *separate* Bernoulli boxes. For instance, maybe the tickets in the first box correspond to the answer given by a survey subject to question 1 on the survey ("yes" and "no") while the tickets in the second box model the answer to question 2 (also "yes" and "no"). If $X$ is the random variable assigning $1$ to "yes" on Q1 and $Y$ assigns $1$ to "no" on Q2, then after computing $X+Y$ for each ticket you will find that one has $2$ on it, two have $1$ on them, and one has $0$ on it. – whuber Feb 05 '22 at 21:41
  • Thankfully, we have now addressed my concern directly! As you can see, the two Bernoulli random variables are not defined on the same domain: for one it is the answer to question 1, and for the other it is the answer to question 2. I have heard and read time and again that since an r.v. is a function, two r.v.s can be added only if their domains are the same. How does adding these two Bernoulli variables with different domains square with that constraint? – MiloMinderbinder Feb 06 '22 at 02:45
  • @MiloMinderbinder I don't follow that comment, because in my example the variables *are* defined on the same domain (namely, the same set of four tickets). This is a requisite for their sum to make sense. When variables are defined on two distinct domains a standard construction (namely, the product of sigma algebras) creates *equivalent* variables defined on a common domain. Necessarily, those variables will be independent. – whuber Feb 06 '22 at 13:42
  • Aren't the random variables in your example mapping the answers to question 1 and question 2 to {1,0}? Tickets, boxes, etc. are how you are modeling the understanding. I just wanted to see it in its very basic form. Two questions, two guys answering them, two RVs... add them. It's that simple. What you say in the second half of your comment is what gets mentioned in very few answers: if the variables are defined on different domains, then you have to first extend their definitions to the Cartesian product of the two domains and then add. Daniel Li has answered that for me. Thank you :) – MiloMinderbinder Feb 06 '22 at 15:08
  • @Milo I don't disagree, but I would demote that construction to a mere technicality. Why? Because in most applications, as in the survey, the "tickets" *already* contain all the needed information. A "ticket" in the survey is one of *all possible ways to complete the survey.* In a two-question, yes/no survey, for instance, the set of all tickets (the "sample space," properly defined) is the set of all ordered pairs {(no,no),(no,yes),(yes,no),(yes,yes)}. Briefly, then, attention to the definition of the sample space usually suffices and Cartesian products are not needed. – whuber Feb 06 '22 at 16:11
  • I see now. Thank you for helping me so patiently whuber! – MiloMinderbinder Feb 06 '22 at 18:11
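
The three-ticket box mentioned in the comments, with $(X,Y)$ values $(0,3),(1,2),(3,0)$, is small enough to tabulate directly. This is just a sketch of that check; the variable names are illustrative.

```python
from collections import Counter

# One ticket per (X, Y) pair, exactly as listed in the comment above.
tickets = [(0, 3), (1, 2), (3, 0)]

# Count how often each value appears in each space on the tickets.
dist_x = Counter(x for x, _ in tickets)
dist_y = Counter(y for _, y in tickets)
dist_sum = Counter(x + y for x, y in tickets)

print("X:  ", dict(dist_x))
print("Y:  ", dict(dist_y))
print("X+Y:", dict(dist_sum))
```

Here $X$ takes the values 0, 1, 3 and $Y$ takes the values 0, 2, 3, each once, while $X+Y$ equals 3 on every ticket, so the three distributions are not shifted copies of one another.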
4

There is no secret behind this phrase; it is as simple as you can think: if X and Y are two random variables, their sum is X + Y, and this sum is a random variable as well. If X_1, X_2, X_3, ..., X_n are n random variables, their sum is X_1 + X_2 + X_3 + ... + X_n, and this sum is also a random variable (and a realization of this sum is a single number, namely the sum of the n realizations).

Why do you talk so much about sums of random variables in class? One reason is the (amazing) central limit theorem: if we sum many independent random variables, then we can "predict" the distribution of this sum (almost) independently of the distribution of the single variables in the sum! Under mild conditions (for instance, identically distributed summands with finite variance), the suitably standardized sum tends toward a normal distribution, and this is a likely reason why we observe the normal distribution so often in the real world.
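
A small simulation can make the central limit theorem concrete. The sketch below assumes Uniform(0, 1) summands; the choice of 30 summands, 10,000 repetitions, and the seed are arbitrary, and none of these settings come from the answer.

```python
import random
import statistics

rng = random.Random(0)   # fixed seed so the run is repeatable
n, trials = 30, 10_000   # assumptions: 30 summands per sum, 10,000 sums

# Each summand is Uniform(0, 1), which has mean 1/2 and variance 1/12.
sums = [sum(rng.random() for _ in range(n)) for _ in range(trials)]

# Standardize each sum with its theoretical mean and standard deviation.
mean, sd = n * 0.5, (n / 12) ** 0.5
z = [(s - mean) / sd for s in sums]

# For a standard normal variable, about 68% of the mass lies within one sd of 0.
within_one_sd = sum(abs(v) <= 1 for v in z) / trials
print(statistics.fmean(z), statistics.pstdev(z), within_one_sd)
```

Each entry of `sums` is one realization of the sum, a single number; the list of many realizations is what reveals the roughly normal shape.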

jolvi
3

An r.v. is a mapping from the outcomes of a random experiment to real numbers. Say, if it's raining the value of X is 1, and if it's not then it is 0. You can have another r.v. Y equal to 10 when it's cold and 100 when it's hot. So, if it's raining and cold, then X=1, Y=10, and X+Y=11.

The possible values of X+Y are 10 (not raining, cold), 11 (raining, cold), 100 (not raining, hot), and 101 (raining, hot). If you figure out the probabilities of these events, then you'll get the PMF of this new r.v. X+Y.
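
As a sketch of that last step, here is the computation in Python with X in {0, 1} for rain and Y in {10, 100} for temperature as defined above. The joint probabilities are invented purely for illustration and are not part of the answer.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical joint probabilities of (weather, temperature); they must sum to 1.
joint = {
    ("rain", "cold"): Fraction(3, 10),
    ("rain", "hot"): Fraction(1, 10),
    ("dry", "cold"): Fraction(2, 10),
    ("dry", "hot"): Fraction(4, 10),
}

X = {"rain": 1, "dry": 0}     # X = 1 if raining, else 0
Y = {"cold": 10, "hot": 100}  # Y = 10 if cold, 100 if hot

# Add up the probability of every outcome that produces each value of X + Y.
pmf = defaultdict(Fraction)
for (weather, temperature), p in joint.items():
    pmf[X[weather] + Y[temperature]] += p

for value, p in sorted(pmf.items()):
    print(value, p)
```

With these assumed probabilities the PMF puts mass 1/5 on 10, 3/10 on 11, 2/5 on 100, and 1/10 on 101.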

Aksakal
2

None of these answers gives a mathematically rigorous way to think about a sum of random variables. Note that $X,Y$ need not be defined on the same outcome domain, and even if they are, $X+Y$ cannot be understood as summing up two functions. Rather, they should first be extended to the domain $\Omega_1\times \Omega_2$. For example, let $X,Y$ be identical functions on $\Omega=\{Head,Tail\}$ where $X(Head)=Y(Head)=1, X(Tail)=Y(Tail)=0$. The domain of $X+Y$ should be $\{(Head,Tail),(Tail,Head),(Head,Head),(Tail,Tail)\}$. Now $X,Y$ are functions on this product space whose values are determined solely by the 1st and 2nd coordinates, respectively. The sum can now be understood as a summation of functions in the usual sense. Note also that the $\sigma$-field and the probability measure have to be defined anew. Saying that $X,Y$ are independent is one way to specify the product measure.
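
The product-space construction can be written out explicitly for the coin example. This is a sketch that assumes two fair, independent coins, so the product measure gives each pair probability 1/4; the names `omega`, `score`, `space`, and `prob` are illustrative.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

omega = ["Head", "Tail"]
score = {"Head": 1, "Tail": 0}

# Product sample space: all ordered pairs (first coin, second coin).
space = list(product(omega, omega))

# Product measure, assuming independent fair coins.
prob = {w: Fraction(1, 2) * Fraction(1, 2) for w in space}

def X(w):
    """X extended to the product space: depends only on the 1st coordinate."""
    return score[w[0]]

def Y(w):
    """Y extended to the product space: depends only on the 2nd coordinate."""
    return score[w[1]]

# X + Y is now an ordinary pointwise sum of two functions on the same domain.
pmf = defaultdict(Fraction)
for w in space:
    pmf[X(w) + Y(w)] += prob[w]

for value, p in sorted(pmf.items()):
    print(value, p)
```

A different probability assignment on `space` would model dependent coins; only the `prob` dictionary would change, not the definitions of `X`, `Y`, or their sum.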

Daniel Li