Suppose I have 3 random variables, $X_1, X_2, X_3$. Define $Z$ as:

$Z = X_1 + X_2 + X_3$

I want to force $Z$ to equal 1 for every realization of $X_1, X_2, X_3$, where $X_i \sim \mathrm{Beta}(a_i, b_i)$. As an example, let $X_i$ denote the share of votes a party will get. In a hypothetical country there are only 3 parties, so $Z$ is the total share of votes, which equals 100%. Statistically, how could someone force $Z$ to equal 1 and, by simulation or through an analytical expression, derive the percentages of parties A, B and C?

AlexandrosB
  • I am not sure if that's enough, but can you make just C=1-A-B? – David May 29 '19 at 07:11
  • I thought about your suggestion, but there might be realizations where C turns negative (e.g. if parties A and B get most of the votes). Also, C might not keep its distributional properties. (I try to think of it as a transformation.) – AlexandrosB May 29 '19 at 07:16
  • Your normals could be outside the interval $[0,1]$, which makes the interpretation as fractions of votes hard. If that is a problem, you may want to ask about beta distributions, or [look for random generation on a simplex](https://stats.stackexchange.com/search?q=simplex+[random-generation]). – Stephan Kolassa May 29 '19 at 07:18
  • Correct, edited – AlexandrosB May 29 '19 at 07:19
  • @David There are two problems with your suggestion. The first is that $C$ could be negative. This reveals the second problem: $C$ will not have a Beta distribution. This is a subtle issue, to be sure: we get many questions of this nature that ask for a set of variables to have given marginal distributions subject to one or more linear relations among them. Most often such variables don't even exist. – whuber May 29 '19 at 11:43
  • @whuber When my suggestion was made, those variables were normal – David May 29 '19 at 11:44
  • @David Sorry about that--I have often worried about how edits to questions can leave comments (and even answers) that make sense but don't have their original meaning. In order to avoid this problem I have been making efforts to quote or restate assertions I am responding to, if there's room and I remember to do it! – whuber May 29 '19 at 11:56

1 Answer

What you want is the Dirichlet distribution.

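It is the generalisation of the Beta distribution to the multivariate case, and every sample from it satisfies $X_i \in (0, 1)$ and $\sum_{i=1}^3 X_i = 1$. In addition, the marginals are themselves Beta distributions and so remain interpretable: if $(X_1, X_2, X_3) \sim \mathrm{Dirichlet}(\alpha_1, \alpha_2, \alpha_3)$ and $\alpha_0 = \alpha_1 + \alpha_2 + \alpha_3$, then $X_i \sim \mathrm{Beta}(\alpha_i, \alpha_0 - \alpha_i)$.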
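As a rough illustration, here is a minimal simulation sketch using NumPy. The concentration parameters in `alpha` are made-up values for the three parties, not estimates from any data, and the sample size and seed are arbitrary. It draws vote shares from a Dirichlet distribution, checks that each draw sums to 1, and compares the empirical means with the means $\alpha_i / \alpha_0$ implied by the Beta marginals.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical concentration parameters for parties A, B and C
# (illustrative values only, not taken from the question).
alpha = np.array([4.0, 3.0, 2.0])

# Each row is one simulated election: three vote shares.
samples = rng.dirichlet(alpha, size=10_000)

# By construction every row sums to 1 (up to floating-point error)
# and every share lies strictly between 0 and 1.
row_sums = samples.sum(axis=1)
all_in_unit_interval = bool(((samples > 0) & (samples < 1)).all())

# The marginal of X_i is Beta(alpha_i, alpha_0 - alpha_i),
# so its mean is alpha_i / alpha_0.
alpha_0 = alpha.sum()
theoretical_means = alpha / alpha_0
empirical_means = samples.mean(axis=0)

print("min/max row sum:", row_sums.min(), row_sums.max())
print("all shares in (0, 1):", all_in_unit_interval)
print("empirical means:  ", empirical_means)
print("theoretical means:", theoretical_means)
```

Because the sum constraint is built into the distribution, no share ever has to be computed as a remainder, so none can turn negative.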

Forgottenscience
  • I have estimated the shape parameters of the Beta distribution for each random variable $X_i$. How could I link these parameters with the concentration parameters of the Dirichlet distribution? – AlexandrosB May 29 '19 at 18:05