Suppose I have $n$ i.i.d. normal observations in $\mathbb{R}^f$ with parameters $(\mu, \Sigma)$, where $\Sigma$ is known to be the identity matrix.
I have the following hypotheses: $H_0^i$: $\mu_i = 0$ for each coordinate $i = 1, \dots, f$.
I calculate the statistics $T_i = \frac{1}{\sqrt{n}} \sum_{j=1}^n x_{ji}$, which are $N(0, 1)$ under $H_0^i$, and the corresponding p-values.
I combine the test statistics/p-values in some way and test the null-hypothesis $H_0 = \bigcap_i H_0^i$.
If I can't reject, I declare $\mu = 0$. If I'm able to reject, I choose the $T_i$ with the lowest p-value, say $i = 3$, and declare $\mu = [0, 0, \frac{1}{n} \sum_{j=1}^n x_{j3}, 0, \dots]$.
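
To make the procedure concrete, here is a minimal sketch in Python. The Bonferroni-adjusted minimum p-value is just one illustrative choice of combination rule, and the true mean I simulate from is made up:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, f = 100, 5
mu_true = np.array([0.0, 0.0, 0.5, 0.0, 0.0])   # hypothetical true mean
x = rng.normal(loc=mu_true, scale=1.0, size=(n, f))

# Per-coordinate statistics: T_i = (1/sqrt(n)) * sum_j x_ji ~ N(0, 1) under H_0^i
T = x.sum(axis=0) / np.sqrt(n)
p = 2 * stats.norm.sf(np.abs(T))                # two-sided p-values

# Combine via the minimum p-value with a Bonferroni adjustment
alpha = 0.05
mu_hat = np.zeros(f)
if p.min() < alpha / f:                         # global null rejected
    i = p.argmin()                              # coordinate with the lowest p-value
    mu_hat[i] = x[:, i].mean()                  # plug in that coordinate's sample mean
print(p, mu_hat)
```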
This seems like a bad idea:
- p-values for feature selection
- Comparing p-values from pairwise permutation tests
- Comparing two p-values from samples of different sizes
etc.
But maybe it's a good idea? https://stats.stackexchange.com/a/207396/142710
I'm reading the well-known paper "Unbiased Recursive Partitioning: A Conditional Inference Framework" by Hothorn, Hornik, and Zeileis, and if I'm reading it correctly, this is exactly the procedure they advocate.
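
As I understand it, their variable-selection step at each node looks roughly like the toy version below. This is my own permutation-based sketch, using a simple correlation statistic rather than the authors' actual test statistics, so treat it as an illustration of the structure, not their method:

```python
import numpy as np

def select_split_variable(X, y, alpha=0.05, n_perm=999, rng=None):
    """Toy variable selection in the spirit of conditional inference trees:
    a permutation p-value per covariate, a Bonferroni-adjusted min-p global
    test of independence, and the smallest p-value picking the variable."""
    rng = rng or np.random.default_rng()
    n, f = X.shape
    pvals = np.empty(f)
    for i in range(f):
        obs = abs(np.corrcoef(X[:, i], y)[0, 1])   # observed association
        perm = np.array([abs(np.corrcoef(X[:, i], rng.permutation(y))[0, 1])
                         for _ in range(n_perm)])
        pvals[i] = (1 + (perm >= obs).sum()) / (n_perm + 1)
    if pvals.min() >= alpha / f:    # global null of independence not rejected
        return None                 # stop: don't split this node
    return int(pvals.argmin())      # otherwise split on the strongest covariate
```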
My question is: is it sometimes acceptable to compare p-values for model selection? What is the right way to think about them in this case?