Exchangeability, causal inference, and partial pooling

Question

In Statistical Rethinking, Richard McElreath writes the following concerning the use of partial pooling (i.e. varying/random effects) in Bayesian hierarchical models:

Could we also use partial pooling on the treatment effects? Yes, we could. Some people will scream “No!” at this suggestion, because they have been taught that varying effects are only for variables that were not experimentally controlled. Since treatment was “fixed” by the experiment, the thinking goes, we should use un-pooled “fixed” effects. This is all wrong. The reason to use varying effects is because they provide better inferences. It doesn’t matter how the clusters arise. If the individual units are exchangeable — the index values could be reassigned without changing the meaning of the model— then partial pooling could help.

My understanding of the concept of exchangeability is that it is not so much a property of a variable per se as a property of observations in relation to a variable, given a causal model. That is, exchangeability obtains when the group membership $X$ of individual observations $x_i$ can be reshuffled without altering the predicted distribution of the response variable $Y$. Or, to rephrase it (correct me if I'm wrong), exchangeability obtains when the group membership of $x_i$ can be reshuffled without altering the inferred effect $Y$ ~ $X$.
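For reference, the formal statement I have in mind is the standard permutation definition, applied to the group-level parameters $\theta_1, \dots, \theta_J$ (one per level of $X$). That this is what McElreath means is my assumption, and the symbols $\theta_j$, $J$, and $\pi$ are my own notation:

$$p(\theta_1, \dots, \theta_J) = p(\theta_{\pi(1)}, \dots, \theta_{\pi(J)}) \quad \text{for every permutation } \pi \text{ of } \{1, \dots, J\}$$

In words, the joint prior over the group-level effects does not change when the group labels are permuted, which is how I read "the index values could be reassigned without changing the meaning of the model."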

Thus, McElreath argues, it doesn't matter whether $X$ is a "nuisance" variable (e.g. study site, subject ID, experimental block, etc.) or a substantive variable (e.g. sex, religious identity, or even experimental treatment). If you can reshuffle $x_i$ across $X$ without altering the predicted distribution of $Y$, you should estimate $Y$ ~ $X$ using partial pooling.
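To make concrete what I mean by partial pooling, here is a minimal sketch, assuming a normal-normal model in which the within-group SD `sigma` and between-group SD `tau` are known. A real hierarchical model would estimate both, so this is only an illustration. The function name `partial_pool`, the scheme labels, and all numeric values are hypothetical.

```python
import random
import statistics


def partial_pool(group_means, group_sizes, sigma, tau):
    """Shrink each group mean toward a precision-weighted grand mean.

    Assumes y_ij ~ Normal(theta_j, sigma) and theta_j ~ Normal(mu, tau),
    with sigma and tau treated as known (a simplifying assumption).
    """
    # Each group mean has marginal variance tau^2 + sigma^2 / n_j.
    weights = [1.0 / (tau**2 + sigma**2 / n) for n in group_sizes]
    grand = sum(w * m for w, m in zip(weights, group_means)) / sum(weights)

    pooled = []
    for m, n in zip(group_means, group_sizes):
        # Shrinkage factor: 0 means no pooling, 1 means complete pooling.
        b = (sigma**2 / n) / (sigma**2 / n + tau**2)
        pooled.append(b * grand + (1.0 - b) * m)
    return grand, pooled


# Hypothetical example: three land management schemes, unequal sample sizes.
rng = random.Random(42)
true_effects = {"X1": 10.0, "X2": 12.0, "X3": 14.0}  # made-up values
sizes = {"X1": 3, "X2": 10, "X3": 30}
sigma, tau = 4.0, 2.0  # assumed known for this sketch

means = [
    statistics.fmean(rng.gauss(true_effects[k], sigma) for _ in range(sizes[k]))
    for k in true_effects
]
grand, pooled = partial_pool(means, list(sizes.values()), sigma, tau)

for name, raw, shrunk in zip(true_effects, means, pooled):
    print(f"{name}: unpooled={raw:.2f}  partially pooled={shrunk:.2f}")
print(f"grand mean={grand:.2f}")
```

Each partially pooled estimate lies between the group's own mean and the grand mean, and groups with fewer observations are pulled further toward the grand mean. Nothing in this calculation depends on whether $X$ is a "nuisance" variable or a treatment, which is the point McElreath seems to be making.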

If I understand this correctly, the only reason why exchangeability would not obtain for $Y$ ~ $X$ is if the effect of $X$ on $Y$ were confounded by some other variable. Given a confound, reassigning $x_i$ across $X$ could change the distribution of $Y$ because the observations would be differently affected by the confounding variable. Thus, it would seem that the concept of exchangeability merges with the concept of causal inference, and the decision of whether to use partial pooling hinges on whether $Y$ ~ $X$ is unconfounded (either marginally or conditionally).

Am I understanding this correctly? To summarize, I have the following questions:

1. If, given a causal model, the total effect of $X$ on $Y$ is unconfounded (either marginally or conditional on a set of covariates), is $x_i$ necessarily exchangeable with respect to $X$?
2. If the total effect of $X$ on $Y$ is unconfounded, implying exchangeability, should the effect $Y$ ~ $X$ always be estimated using partial pooling?
3. Are "exchangeability" and "causal inference" just two sides of the same coin with respect to a given $Y$ ~ $X$ model? If so, any model claiming to make causal inference with respect to $Y$ ~ $X$ can and should estimate the model using partial pooling, right?

Thank you for your input. I know exchangeability is much-discussed on SE, but I have never found an answer to this particular formulation of questions.

Just a comment on the quoted block... In most cases, a treatment effect is the coefficient of a two-level factor, treatment vs. control. What's the point of using a random effect / partial pooling for a sample of size 2? Agree it's possible, but feels very odd. — JTH, Jan 18 '22 at 16:06
@JTH Yes, I can see how this would be a moot point in some fields. In my field of ecology, though, we often deal with multi-level grouping variables that are the focus of inference. For example, we might want to model the effect of land management schemes (X1, X2, X3, etc.) on local species richness. According to convention, land management would be treated as a fixed effect. But according to my understanding of McElreath, land management should be treated as a varying effect, as long as it is not confounded. — dbspon, Jan 19 '22 at 07:20
This question approaches the same issue of whether to use fixed or varying effects for substantive variables, but without reference to exchangeability as the criterion for making that decision: https://stats.stackexchange.com/questions/477313/why-is-it-ok-to-model-demographics-as-random-effects-in-bayesian-multilevel-mode — dbspon, Feb 01 '22 at 11:45
