Assume we have 500 predictors and one response. Can we fit a univariate regression of Y on each X separately and then keep the predictors that have the highest R-squared and p < 0.05? The selected predictors would then be used in multivariate combinations, stepwise regression, etc.
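For concreteness, the screening step could be sketched as follows. This is only a minimal illustration under assumptions that are not part of the question: the data are simulated, `n = 100` observations is an arbitrary choice, and the response and all 500 predictors are independent noise, so no predictor is truly related to Y. The names `pearson_r`, `T_CRIT` and `selected` are made up for the example.

```python
import math
import random

random.seed(0)
n, p = 100, 500  # observations (assumed), predictors (500 as in the question)

def pearson_r(a, b):
    """Sample Pearson correlation of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)

# Assumption: y and every column of X are independent standard normal noise.
y = [random.gauss(0, 1) for _ in range(n)]
X = [[random.gauss(0, 1) for _ in range(n)] for _ in range(p)]

# In a simple regression R^2 equals r^2, and p < 0.05 (two-sided) is
# equivalent to |t| > t_crit with t = r * sqrt((n - 2) / (1 - r^2)).
T_CRIT = 1.984  # approximate 5% two-sided critical value of Student's t, 98 df
r2 = [pearson_r(x, y) ** 2 for x in X]
t_abs = [math.sqrt(v * (n - 2) / (1 - v)) for v in r2]
selected = [j for j in range(p) if t_abs[j] > T_CRIT]

print(f"{len(selected)} of {p} pure-noise predictors pass p < 0.05")
print(f"largest univariate R^2: {max(r2):.3f}")
```

Because each test has a 5% false-positive rate, about 25 of the 500 noise predictors are expected to pass the screen by chance alone, even though none of them carries any information about Y.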
I have not found an explanation of why this is inadvisable, and I cannot prove mathematically why it intuitively sounds wrong.
I know that selecting, say, the top 20 predictors by univariate R-squared does not guarantee that a multivariate model built from any combination of them will be better than a multivariate model built from the next 20 or the bottom 20.
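A small constructed case shows how this can happen. The sketch below is deliberately contrived and rests on assumptions of my own: the data are simulated, `y` is defined as exactly `x1 - x2` with `x1` and `x2` almost collinear, and `x3`, `x4` are noisy copies of `y`. The helper `r2_pair` uses the standard two-predictor formula R^2 = (r_y1^2 + r_y2^2 - 2 r_y1 r_y2 r_12) / (1 - r_12^2).

```python
import math
import random

random.seed(1)
n = 2000  # assumed sample size

def pearson_r(a, b):
    """Sample Pearson correlation of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)

def r2_pair(y, a, b):
    """R^2 of the least-squares fit of y on a and b (with intercept)."""
    rya, ryb, rab = pearson_r(y, a), pearson_r(y, b), pearson_r(a, b)
    return (rya**2 + ryb**2 - 2 * rya * ryb * rab) / (1 - rab**2)

# x1 and x2 are nearly identical; y is their (small) difference.
z = [random.gauss(0, 1) for _ in range(n)]
d = [random.gauss(0, 0.1) for _ in range(n)]
x1 = [zi + di for zi, di in zip(z, d)]
x2 = z
y = [a - b for a, b in zip(x1, x2)]

# x3 and x4 are noisy copies of y, so each looks good on its own.
x3 = [yi + random.gauss(0, 0.1) for yi in y]
x4 = [yi + random.gauss(0, 0.1) for yi in y]

# Univariate R^2 of each predictor (R^2 = r^2 in simple regression).
uni = {name: pearson_r(x, y) ** 2
       for name, x in [("x1", x1), ("x2", x2), ("x3", x3), ("x4", x4)]}
r2_top = r2_pair(y, x3, x4)     # the two best univariate predictors
r2_bottom = r2_pair(y, x1, x2)  # the two worst univariate predictors

for name, v in uni.items():
    print(f"univariate R^2 of {name}: {v:.3f}")
print(f"R^2 of y ~ x3 + x4 (top two):    {r2_top:.3f}")
print(f"R^2 of y ~ x1 + x2 (bottom two): {r2_bottom:.3f}")
```

Under these assumptions `x1` and `x2` would rank last in a univariate screen, yet together they reproduce `y` essentially perfectly, while the top-ranked pair `x3`, `x4` cannot because each only repeats the same noisy signal.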