
Say we observe data $D$ drawn from a probability distribution $P[D|\theta]$, where $\theta$ denotes the unknown model parameters. Given this information, what is the probability distribution of future data $D'$?

Further questions:

  1. Does this aim have a specific name in the literature?
  2. Can this question be addressed using Frequentist statistics?
  3. For Bayesian statistics I came up with the following procedure:
    • Find the posterior $P[\theta|D]$
    • Find $P[D'|D] = \int P[D'|\theta]\,P[\theta|D]\,d\theta$

Does this approach make sense? Is this what people typically do?
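
To make the two-step procedure above concrete, here is a minimal sketch for one simple case. Everything specific in it is an assumption for illustration and not part of the question: the data $D$ are $k$ successes in $n$ Bernoulli trials, the prior on $\theta$ is Beta($a$, $b$), and $D'$ is the number of successes in $m$ future trials. In this conjugate case the integral has a closed form (a beta-binomial distribution), which the sketch compares against a Monte Carlo version of the same integral and against simply plugging in the MLE.

```python
import math
import random


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def posterior_predictive_pmf(k_new, m, k, n, a=1.0, b=1.0):
    """Closed-form P[D'|D] for a Beta(a, b) prior (hypothetical choice).

    D  = k successes in n observed trials,
    D' = k_new successes in m future trials.
    """
    a_post, b_post = a + k, b + n - k  # step 1: posterior is Beta(a_post, b_post)
    log_ratio = log_beta(a_post + k_new, b_post + m - k_new) - log_beta(a_post, b_post)
    return math.comb(m, k_new) * math.exp(log_ratio)  # step 2: integral over theta


def monte_carlo_pmf(k_new, m, k, n, a=1.0, b=1.0, draws=20_000, seed=0):
    """Same integral, approximated by averaging P[D'|theta] over posterior draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        theta = rng.betavariate(a + k, b + n - k)
        total += math.comb(m, k_new) * theta**k_new * (1 - theta) ** (m - k_new)
    return total / draws


def plug_in_pmf(k_new, m, k, n):
    """P[D'|theta_MLE]: ignores the uncertainty in the estimate of theta."""
    theta_mle = k / n
    return math.comb(m, k_new) * theta_mle**k_new * (1 - theta_mle) ** (m - k_new)


def variance(pmf_values):
    """Variance of a distribution on 0..len(pmf_values)-1."""
    mean = sum(i * p for i, p in enumerate(pmf_values))
    return sum((i - mean) ** 2 * p for i, p in enumerate(pmf_values))


if __name__ == "__main__":
    k, n, m = 7, 10, 10  # illustrative numbers only
    for k_new in range(m + 1):
        print(
            k_new,
            round(posterior_predictive_pmf(k_new, m, k, n), 4),
            round(monte_carlo_pmf(k_new, m, k, n), 4),
            round(plug_in_pmf(k_new, m, k, n), 4),
        )
```

In this sketch the posterior predictive distribution comes out wider than the plug-in distribution, because it averages over the remaining uncertainty in $\theta$ instead of fixing it at a single value.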

Aleksejs Fomins
  • You found the posterior predictive distribution. – Arya McCarthy Apr 14 '21 at 21:26
  • @AryaMcCarthy Thanks, that answers Q3 and Q1 – Aleksejs Fomins Apr 15 '21 at 07:34
  • Wasn't trained in the frequentist arts, but I think their notion is to find $\theta_{\text{MLE}}$ based on $D$, then compute the probability of $D'$ under that model. – Arya McCarthy Apr 15 '21 at 18:46
  • Yeah, that's the obvious solution, but it is of course wrong. Surely you don't need the notion of a prior to see that picking the MLE parameter for prediction ignores the uncertainty of the MLE. I was wondering if one can do something similar to the posterior predictive distribution, but instead integrate over the distribution of the parameter estimator – Aleksejs Fomins Apr 15 '21 at 18:51
  • Once you start integrating out parameters, you've entered Bayesian-land. Is this question based on some external requirement to operate in a frequentist framework, or is it curiosity? – Arya McCarthy Apr 15 '21 at 18:58
  • Yes, I agree. Still, this question is very important and quite common in practice. Given the number of followers of the frequentist school, I would guess that they must have come up with some way to deal with it. I don't really care too much, just curious – Aleksejs Fomins Apr 15 '21 at 19:01
  • @Arya McCarthy: Well, we have frequentist [tag:prediction-interval]s, see this post https://stats.stackexchange.com/questions/473512/prediction-intervals-for-a-single-random-variable. They take into account parameter uncertainty without integrating it out in Bayesian fashion, but typically using a [tag:pivot] in the same way as in constructing a confidence interval. I will try to write an answer! – kjetil b halvorsen Apr 15 '21 at 23:53
  • I believe https://stats.stackexchange.com/questions/26702 might answer #2. For examples, see https://stats.stackexchange.com/questions/17773, https://stats.stackexchange.com/questions/265856, https://stats.stackexchange.com/questions/14515, *etc.* (found with a search for "prediction interval"). – whuber Apr 16 '21 at 20:15
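
The pivot-based prediction interval mentioned in the comments can be sketched for one textbook case. The setup is an assumption chosen for illustration, not something stated in the thread: the data are i.i.d. normal with unknown mean and a known standard deviation `sigma`. Then $(X_{\text{new}} - \bar{X}) / (\sigma\sqrt{1 + 1/n})$ is standard normal whatever the true mean is, so it can serve as a pivot. When the standard deviation is also unknown, the usual version replaces it with the sample standard deviation and uses a Student $t$ distribution with $n-1$ degrees of freedom. The helper names below are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean


def prediction_interval(data, sigma, level=0.95):
    """Pivot-based interval for one future observation (known sigma assumed).

    The factor sqrt(1 + 1/n) accounts for the uncertainty in the sample mean.
    """
    n = len(data)
    mean = fmean(data)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma * math.sqrt(1 + 1 / n)
    return mean - half_width, mean + half_width


def plug_in_interval(data, sigma, level=0.95):
    """Naive interval that treats the sample mean as if it were the true mean."""
    mean = fmean(data)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return mean - z * sigma, mean + z * sigma


def coverage(interval_fn, n=5, mu=0.0, sigma=1.0, trials=20_000, seed=0):
    """Fraction of simulated experiments in which a new draw lands in the interval."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        data = [rng.gauss(mu, sigma) for _ in range(n)]
        lower, upper = interval_fn(data, sigma)
        x_new = rng.gauss(mu, sigma)
        hits += lower <= x_new <= upper
    return hits / trials


if __name__ == "__main__":
    print("pivot interval coverage:  ", coverage(prediction_interval))
    print("plug-in interval coverage:", coverage(plug_in_interval))
```

In simulation the pivot interval should cover a new observation at close to the nominal rate, while the plug-in interval falls short for small $n$, which is the frequentist counterpart of the concern about ignoring the uncertainty of the MLE.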
