
Imagine a number of variates $x_i$, and a number of processes $P_k$ which depend on these variables in an unknown way (i.e. there are no clear-cut formulas to work with).

Now consider the scenario where you can only get snapshot measurements of a subset of the $x_i$ (let's refer to them as $x_{obs}$), and based on those you would like to be able to judge whether or not the $P_k$ are subject to a systematic change (as opposed to mere fluctuations) as the $x_i$ vary from a baseline value.

For that purpose I devised a model that outputs a continuous value, $y_k = f(x_{obs})$, which is intended to reflect the level of change in $P_k$. However, since the $y_k$ are arbitrary values with no real meaning, instead of reporting them directly I take a Monte Carlo approach: I sample a large number of mock observations $x^{n}_{obs}$ from the same empirical distribution as $x_{obs}$ and calculate the resulting distribution of the scores $y_k$. I then report the probability of observing a score at least as large as a particular value $A$, $p_k := P(y_k \geq A)$, given the estimated distribution.
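To make the procedure concrete, here is a minimal sketch of the resampling step. The `score` function is a hypothetical stand-in for $f$ (the real model is not shown), `snapshots` and `threshold` are made-up illustrations of the observed $x_{obs}$ and of $A$, and drawing with replacement from the observed snapshots is only one assumed way of sampling from the empirical distribution.

```python
import random


def score(x_obs):
    """Hypothetical stand-in for y_k = f(x_obs); the actual model is not shown."""
    return sum(x_obs)


def tail_probability(snapshots, threshold, n_mock=10_000, seed=0):
    """Estimate p_k = P(y_k >= threshold) by resampling observed snapshots."""
    rng = random.Random(seed)
    # Draw mock observations with replacement from the empirical distribution
    # of the observed snapshots, and score each one.
    mock_scores = [score(rng.choice(snapshots)) for _ in range(n_mock)]
    # The reported value is the fraction of mock scores at or above the threshold.
    hits = sum(1 for y in mock_scores if y >= threshold)
    return hits / n_mock


# Made-up snapshot measurements of two observed variates (illustrative only).
snapshots = [(0.1, 0.2), (0.4, 0.1), (0.3, 0.3), (0.9, 0.8)]
p_k = tail_probability(snapshots, threshold=1.0)
```

With this toy data only one of the four snapshots scores above the threshold, so `p_k` should land near 0.25; the number says nothing beyond the model and the resampling scheme used.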

Question 1: How should I refer to my $p_k$ values in tables and figures? I am not sure what the correct terminology would be:

  • "Probability": would not be correct, as we don't know anything about the "real" probabilities. Essentially, the outcome is either "yes, subject to meaningful change" or "no, random fluctuations", so the only way it makes sense is to say "the probability of the process being subject to meaningful change, given the model used". Quite frankly, that's a mouthful and doesn't really work for legends.

  • "Likelihood": pretty much the same as above, I guess. I checked this great question to learn more about the distinction between the two, but I am not sure whether it applies to the case at hand.

  • "Significance": I like this option the most, as it essentially portrays what the probabilities represent: the significance of observing a particular value from the model. But seeing as the word is used very liberally, I'm afraid it might be misunderstood and cause me headaches down the line.

Question 2: If I have several thousand $P_k$, would I need to FDR-correct my $p_k := P(y_k \geq A)$? My initial instinct is "no", because there is no formal testing taking place here, but recently I was asked whether FDR correction wouldn't be necessary and I couldn't fully justify my answer, which got me thinking...
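For concreteness, "FDR correcting" here would usually mean something like the Benjamini-Hochberg step-up procedure. Below is a minimal sketch, assuming the $p_k$ can be treated like p-values, which is exactly the point in question. The function name, the example values and `alpha` are all illustrative.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the BH step-up procedure rejects."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank r such that p_(r) <= alpha * r / m.
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            cutoff_rank = rank
    # Everything ranked at or below the cutoff is flagged.
    rejected = [False] * m
    for i in order[:cutoff_rank]:
        rejected[i] = True
    return rejected


# Illustrative values only, not results from the model described above.
p_values = [0.001, 0.008, 0.039, 0.041, 0.9]
flags = benjamini_hochberg(p_values, alpha=0.05)
```

In this toy example only the two smallest values are flagged, although four of the five are below 0.05 on their own.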

posdef