Has there been, or is there, a consensus about how permutation testing should be done in multiply adjusted regression analyses? I understand the notion of "iteratively permuting the outcome variable" so as to simulate a distribution of the data under the null hypothesis. In doing this, the test statistic's distribution under the null can be obtained and used to compute a p-value.
Suppose we are fitting the multiple regression model:
$$E[Y|X, W] = \beta_0 + \beta_1 X + \beta_2 W$$
and we are interested in testing the hypothesis:
$$\mathcal{H}_0: \beta_1 = 0 \quad \text{vs} \quad \mathcal{H}_1: \beta_1 \ne 0$$
If one were to also fit the reduced regression model:
$$E[Y|W] = \beta^*_0 + \beta^*_2 W$$
then $\beta_2$ may be entirely different from $\beta_2^*$ because of the causal relationships among $X$, $W$, and $Y$. This prompts the question: what is $\beta_2$, the conditional mean difference in $Y$ per unit difference in $W$, supposed to be under the null hypothesis? I am worried that permuting $Y$ throws the baby out with the bathwater, so to speak.
If $W$ confounds the relationship between $X$ and $Y$, then in an analysis not adjusting for $X$, $W$ should still have a causal relationship with the outcome of interest (on top of being correlated with or causally related to $X$). This suggests that under the null, $\beta_2$ should at least not be zero, even if it is not $\beta_2^*$ per se, though that is the most logical assumption.
However, if we randomly permute the labels of $Y$ according to the assumptions of the permutation test, the coefficient on $W$ obtained in the permuted distribution is not $\beta_2$ but 0.
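To make the concern concrete, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the sample size, the effect sizes (a true $\beta_1 = 0$ and $\beta_2 = 2$), the Gaussian errors, and the helper name `slopes`, which uses the closed-form two-predictor OLS solution. It compares the fitted coefficient on $W$ in the original data with its average over naive permutations of $Y$.

```python
import random

def slopes(y, x, w):
    """OLS slopes (b1 for x, b2 for w) in y ~ 1 + x + w, from centered sums."""
    n = len(y)
    my, mx, mw = sum(y) / n, sum(x) / n, sum(w) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sww = sum((b - mw) ** 2 for b in w)
    sxw = sum((a - mx) * (b - mw) for a, b in zip(x, w))
    sxy = sum((a - mx) * (c - my) for a, c in zip(x, y))
    swy = sum((b - mw) * (c - my) for b, c in zip(w, y))
    det = sxx * sww - sxw ** 2
    b1 = (sww * sxy - sxw * swy) / det
    b2 = (sxx * swy - sxw * sxy) / det
    return b1, b2

rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 500
w = [rng.gauss(0, 1) for _ in range(n)]
x = [0.8 * wi + rng.gauss(0, 1) for wi in w]  # X depends on W (assumed)
y = [2.0 * wi + rng.gauss(0, 1) for wi in w]  # null is true: beta_1 = 0, beta_2 = 2

b1_obs, b2_obs = slopes(y, x, w)

# Naive permutation: shuffle Y across all observations, refit, keep the W slope.
b2_perm = []
for _ in range(200):
    y_star = y[:]
    rng.shuffle(y_star)
    b2_perm.append(slopes(y_star, x, w)[1])
mean_b2_perm = sum(b2_perm) / len(b2_perm)

print(f"beta_2 in original data:        {b2_obs:.3f}")
print(f"mean beta_2 under permuted Y:   {mean_b2_perm:.3f}")
```

In this setup the original fit should recover a coefficient on $W$ near 2, while the permuted fits average near 0, which is the discrepancy described above.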
If $W$ were categorical, say binary 0/1, the most logical way to obtain a consistent permutation test would be to permute $Y$ only within clusters (levels) of $W$. Calling the permuted outcome $Y^{(*)}$, this gives $E[Y|W=0] = E[Y^{(*)}|W=0]$ and $E[Y|W=1] = E[Y^{(*)}|W=1]$, and since $\text{cor}(Y^{(*)}, X) \approx 0$, it follows that $\beta_2|\mathcal{H}_0\text{ is true} = \beta_2^*$.
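A rough sketch of what I mean by permuting within levels of a binary $W$ is below. The function names `permute_within_strata` and `stratum_mean` and the simulated data are hypothetical, chosen only for illustration. The point is that the values of $Y$ are shuffled among observations sharing the same $W$, so the within-level means of $Y$ are unchanged.

```python
import random
from collections import defaultdict

def permute_within_strata(y, w, rng):
    """Shuffle y only among observations that share the same value of w."""
    idx_by_level = defaultdict(list)
    for i, level in enumerate(w):
        idx_by_level[level].append(i)
    y_star = list(y)
    for idx in idx_by_level.values():
        shuffled = idx[:]
        rng.shuffle(shuffled)
        # Move each value to a randomly chosen position in the same stratum.
        for src, dst in zip(idx, shuffled):
            y_star[dst] = y[src]
    return y_star

def stratum_mean(y, w, level):
    """Mean of y among observations with w == level."""
    vals = [yi for yi, wi in zip(y, w) if wi == level]
    return sum(vals) / len(vals)

rng = random.Random(1)  # fixed seed so the sketch is reproducible
n = 200
w = [rng.randint(0, 1) for _ in range(n)]            # binary W
y = [1.0 + 2.0 * wi + rng.gauss(0, 1) for wi in w]   # Y depends on W (assumed)

y_star = permute_within_strata(y, w, rng)

for level in (0, 1):
    print(level, stratum_mean(y, w, level), stratum_mean(y_star, w, level))
```

The two printed means for each level should agree up to floating-point rounding, while the pairing of individual $Y$ values with $X$ is broken within each level.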
But there is no analogue for continuous $W$.