
Given $\theta_i$, $0 < \theta_i < 1$, consider sequences of independent Bernoulli($\theta_i$) random variables from $I$ subpopulations, also independent across subpopulations. Suppose $I = 2$ (two distinct subpopulations in the population, with subpopulation proportions $\pi_1$ and $\pi_2$ satisfying $\sum_i \pi_i = 1$). How do we go about simultaneous estimation of $\theta_1$ and $\theta_2$ with the following properties:

  1. $\theta_1$ and $\theta_2$ have beta prior distributions
  2. $\pi_1$ and $\pi_2$ have a Dirichlet prior distribution
  3. $\pi$ is unknown, and $\theta$ and $\pi$ are independent
  4. Estimation loss is the sum of component losses, and each component loss is squared error:
    $L(\theta_i, \hat \theta_i) = (\theta_i - \hat \theta_i)^2$

How do we find the Bayes estimator of $\theta = (\theta_1, \theta_2)$ and its posterior expected loss?

Any solved example/direction/text/links would be helpful!

Guest
  • What is your observed data? – J. Delaney Feb 08 '22 at 11:55
  • Apologies for not including this before. The data come from uncontrolled sampling (the random variable is observed from the entire population without controlling which subpopulation is sampled). The structure of the data can be assumed as follows: given $\pi$ and $\theta$, $X_n$, the $I \times 2$ matrix with elements $\{X_{ijn}\}$, has a multinomial distribution ($n$ trials) with cell probabilities $\theta_i \pi_i$ for cells $(i, 1)$ and $(1 - \theta_i) \pi_i$ for cells $(i, 2)$. Also, given $\pi$, $(X_{1 \cdot n}, \ldots, X_{I \cdot n})$ has a multinomial distribution ($n$ trials) with cell probabilities $\pi_1, \ldots, \pi_I$. In our case, we assume $I = 2$ subpopulations. – Guest Feb 08 '22 at 17:43

1 Answer


What you are describing is a mixture of Bernoulli random variables. First, a minor correction: if you have two subpopulations, then you don't need a Dirichlet distribution for the mixing proportions; the beta distribution (the two-category special case of the Dirichlet) suffices, so the mixing proportion is

$$ \pi \sim \mathsf{Beta}(\alpha_\pi, \beta_\pi) $$

where the mixing proportions for the subgroups are $\pi$ and $1 - \pi$ respectively. In such a case, your model is

$$\begin{align} \theta_i &\sim \mathsf{Beta}(\alpha_{\theta_i}, \beta_{\theta_i}) \\ \pi &\sim \mathsf{Beta}(\alpha_\pi, \beta_\pi) \\ y_j &\sim \pi \; \mathsf{Bern}(\theta_1) + (1 - \pi) \; \mathsf{Bern}(\theta_2)\\ \end{align}$$
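
To make this generative structure concrete, here is a minimal simulation sketch in Python; the sample size and the "true" parameter values are purely illustrative assumptions, not part of the question:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative "true" values (assumptions chosen only for this sketch)
n = 500
pi_true = 0.6                # mixing proportion of subpopulation 1
theta_true = (0.2, 0.8)      # Bernoulli success probabilities

# Draw latent subpopulation labels, then Bernoulli outcomes within each
z = rng.random(n) < pi_true                                   # True -> subpopulation 1
y = rng.binomial(1, np.where(z, theta_true[0], theta_true[1]))
```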

As with other mixtures, the posterior does not have a closed form. The usual approach is to fit the model either by maximum likelihood, using the E-M algorithm (see e.g. those slides), or by Bayesian estimation, using MCMC sampling (e.g. this paper). If you take the Bayesian approach, beware of the label-switching problem, which usually requires special precautions (e.g. imposing an ordering constraint such as $\theta_1 < \theta_2$). Once you have the posterior distribution, the point estimate that minimizes your summed squared-error loss is the vector of posterior means, and the corresponding posterior expected loss is the sum of the posterior variances:

$$ \hat\theta_i = \mathbb{E}[\theta_i \mid y], \qquad \mathbb{E}\big[ L(\theta, \hat\theta) \mid y \big] = \sum_{i=1}^{2} \operatorname{Var}(\theta_i \mid y). $$
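
For illustration, here is a minimal Gibbs-sampler sketch of the Bayesian approach, continuing from the simulated `y` above. It augments the data with latent component labels and uses the conjugate beta updates; all hyperparameters, starting values, and iteration counts are arbitrary assumptions:

```python
# Gibbs sampler for the two-component Bernoulli mixture (continues from `y` above)
a_th, b_th = np.ones(2), np.ones(2)   # Beta(1, 1) priors on theta_1, theta_2
a_pi, b_pi = 1.0, 1.0                 # Beta(1, 1) prior on pi
n_iter, burn = 5000, 1000

theta = np.array([0.3, 0.7])          # starting values (ordering here does not by
pi = 0.5                              # itself prevent label switching later)
draws = np.empty((n_iter, 3))         # columns: theta_1, theta_2, pi

for t in range(n_iter):
    # 1. Sample latent labels given (theta, pi): P(label 1 | y_j) ∝ pi * Bern(y_j; theta_1)
    p1 = pi * theta[0] ** y * (1 - theta[0]) ** (1 - y)
    p2 = (1 - pi) * theta[1] ** y * (1 - theta[1]) ** (1 - y)
    in1 = rng.random(len(y)) < p1 / (p1 + p2)       # True -> component 1

    # 2. Conjugate beta updates for theta_1, theta_2, and pi given the labels
    for i, mask in enumerate((in1, ~in1)):
        theta[i] = rng.beta(a_th[i] + y[mask].sum(),
                            b_th[i] + (1 - y[mask]).sum())
    pi = rng.beta(a_pi + in1.sum(), b_pi + (~in1).sum())
    draws[t] = theta[0], theta[1], pi

post = draws[burn:]
print("Bayes estimates (posterior means):", post.mean(axis=0))
print("posterior expected loss (sum of variances):", post[:, :2].var(axis=0).sum())
```

In a more careful implementation you would also check the draws for label switching (e.g., relabel them so that $\theta_1 < \theta_2$ holds in every iteration) before computing these summaries.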

Tim