
I am trying to figure out if I've got it down correctly. (Sorry it's so small, you can click on the picture to enlarge it)

EDIT: About the notation: A random experiment is performed and the set of possible outcomes is $S = \{s_1, s_2, s_3, s_4, s_5\}$. The random variable is an injective function $X: S \to \mathbb{R}$. The pmf is a not necessarily injective function $f: \text{range}(X) \to (0, 1)$.

Ovi
  • This generally looks right, but it's hard to tell without any accompanying text and descriptions of your notation. Will you supply some? – StatsStudent May 29 '17 at 19:43
  • @Analyst It doesn't look right to me. The probabilities should not be computed via the random variable $X$: they should come directly from the probability measure on the sample space. It doesn't seem possible to determine what this figure is stating or how to respond to it. – whuber May 29 '17 at 19:47
  • @Analyst1 Sorry, I added details in the OP. – Ovi May 29 '17 at 19:50
  • Thank you: that clarifies where you could benefit from our reactions and what needs improving. (+1) – whuber May 29 '17 at 19:51
  • May I ask why you believe $X$ must be injective? For instance, if $X$ models a binary (0/1) response and the sample space is a large population of people, how could it possibly be injective? The relevant property is *measurability* rather than injectivity. – whuber May 29 '17 at 19:53
  • @whuber I thought that we define it that way for convenience. If there are two elements of $S$ (say $s_1, s_2$) which map to $x$, then $f(x) = P(\{s_1, s_2\})$. So then it seems like we cannot use the pmf to talk about the probability of just $\{s_1\}$ or just $\{s_2\}$. – Ovi May 29 '17 at 20:01
  • That approach seems natural, but it leads to confusion. At https://stats.stackexchange.com/a/54894 I share a relatively non-mathematical (but nevertheless rigorous) account of these concepts. The key is to distinguish the sample space from the random variables, so that you can describe situations where there are multiple random variables. The sample space is a mathematical model of the real-world things you can select for observation or measurement and the random variables model the values of those observations. It's essential that two real things can have the same value! – whuber May 29 '17 at 20:06
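
To make the point in the comments above concrete, here is a minimal sketch of how a pmf can be obtained from a probability measure on the sample space when $X$ is not injective. The measure `P`, the map `X`, and the helper `induced_pmf` are all hypothetical and chosen only for illustration; they are not taken from the original diagram.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical probability measure on the sample space S = {s1, ..., s5}.
P = {
    "s1": Fraction(1, 2),
    "s2": Fraction(1, 4),
    "s3": Fraction(1, 8),
    "s4": Fraction(1, 16),
    "s5": Fraction(1, 16),
}

# Hypothetical non-injective random variable: several outcomes share a value.
X = {"s1": 0, "s2": 0, "s3": 1, "s4": 1, "s5": 1}


def induced_pmf(P, X):
    """Return f with f(x) = P({s in S : X(s) = x})."""
    f = defaultdict(Fraction)  # Fraction() is 0, so masses accumulate from zero
    for s, p in P.items():
        f[X[s]] += p
    return dict(f)


pmf = induced_pmf(P, X)
```

In this sketch the probabilities of $\{s_1\}$ and $\{s_2\}$ remain available through `P`, while `pmf` only describes the distribution of the values of $X$.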

1 Answer


Apart from an inconsequential, obvious typo in the $s_1$/$s_2$ mapping, the problem with the above diagram is that it is incomplete: although it is correct from a mathematical point of view, it does not convey the special nature of the probability mass function.

There are many functions that can be defined with the range of $X$ as their domain. A subset of them will also have $(0,1)$ as their range. But the PMF satisfies an additional critical condition, so as to function as a probability measure:

That it sums up (for discrete measures) or integrates (for absolutely continuous measures), or a little bit of both (for mixed measures), to unity.

Without this third condition it is not a probability mass function or a probability density function. And this critical condition is not reflected in the above diagram.
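
As a rough illustration of this condition, the following sketch checks whether a candidate function on a finite range is a pmf. The helper `is_pmf` and the example masses are assumptions made for illustration, and the check allows a mass of exactly 1 (a degenerate distribution), which is slightly wider than the interval $(0,1)$ used in the question.

```python
from fractions import Fraction


def is_pmf(f):
    """Return True if every mass lies in (0, 1] and the masses sum to exactly 1."""
    masses = list(f.values())
    return all(0 < p <= 1 for p in masses) and sum(masses) == 1


# Hypothetical masses on five values of X: they sum to one, so this is a pmf.
f = {
    1: Fraction(7, 10),
    2: Fraction(2, 10),
    3: Fraction(1, 30),
    4: Fraction(1, 30),
    5: Fraction(1, 30),
}

# Hypothetical function with every value in (0, 1) that is NOT a pmf (sum is 3/2).
g = {1: Fraction(1, 2), 2: Fraction(1, 2), 3: Fraction(1, 2)}

f_ok = is_pmf(f)
g_ok = is_pmf(g)
```

Exact fractions are used here so that the sum can be compared with 1 without floating-point rounding.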

Alecos Papadopoulos
  • Thanks for the reply. I tried to satisfy this critical condition as well, but I guess I didn't do it too clearly. On the real line on the right side, I tried to put one output at about $0.7$, another output at $0.2$, and three outputs at $\dfrac{0.1}{3} \approx 0.033$. – Ovi May 29 '17 at 20:07
  • @Ovi, this is a matter of visual inventiveness. Suggestion: instead of a line to the right of the diagram, use a square box of height 1 and base 1. Represent the values of the pmf as horizontal slices of the box, each with its own height, that together fill it completely. This works for pmfs only, not for densities. – Alecos Papadopoulos May 29 '17 at 20:12