Under some strict assumptions, regressing the outcome on the treatment and covariates does indeed control confounding by those covariates; see Schafer & Kang (2008) for more details. This approach is indeed called ANCOVA. It works because the coefficient on treatment is interpreted as the effect of treatment holding the other variables constant, and holding those variables constant removes their confounding influence.
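To make this concrete, here is a minimal sketch in Python with statsmodels; the data-generating process and variable names are hypothetical, purely for illustration. A confounder x affects both the treatment a and the outcome y, so the unadjusted regression is biased, while adjusting for x recovers the treatment effect (under the assumptions discussed below).

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 5_000
x = rng.normal(size=n)                          # hypothetical confounder
a = (x + rng.normal(size=n) > 0).astype(float)  # treatment depends on x
y = 2.0 * a + 3.0 * x + rng.normal(size=n)      # true treatment effect = 2

# Unadjusted regression of y on a alone is confounded by x
naive = sm.OLS(y, sm.add_constant(a)).fit()

# Adjusting for x (ANCOVA-style) recovers the effect, under the assumptions above
adjusted = sm.OLS(y, sm.add_constant(np.column_stack([a, x]))).fit()
print(naive.params[1], adjusted.params[1])  # biased estimate vs. roughly 2.0
```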
The assumptions required are very strict, though. First, you must assume the covariates are sufficient to remove confounding and do not themselves induce confounding; you need a causal theory to justify this, and it isn't empirically verifiable. Second, the effect of the covariates on the outcome must be exactly as modeled; any nonlinear relationships or interactions must be accounted for. Third, there must be no moderation of the treatment effect by the covariates, which is extremely unlikely to hold. You can still estimate an average treatment effect in the presence of moderation by interacting the treatment with mean-centered versions of the covariates. Fourth, there must be no measurement error in the covariates or treatment; if there is, the coefficients will be biased in unpredictable directions (though most often downward).
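Continuing the same hypothetical simulation, here is a sketch of that mean-centered interaction approach: when the covariate moderates the treatment effect, interacting the treatment with the mean-centered covariate lets the coefficient on treatment be read as (approximately) the average treatment effect.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 5_000
x = rng.normal(size=n)
a = (x + rng.normal(size=n) > 0).astype(float)
y = 2.0 * a + 3.0 * x + 1.5 * a * x + rng.normal(size=n)  # effect of a varies with x

x_c = x - x.mean()                      # mean-center the covariate
X = np.column_stack([a, x_c, a * x_c])  # treatment, covariate, interaction
fit = sm.OLS(y, sm.add_constant(X)).fit()
print(fit.params[1])  # coefficient on a, roughly the average treatment effect (about 2 here)
```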
The benefit of other methods like IPW and propensity score matching is that they avoid some of these assumptions or replace them with others. For example, propensity score methods require that you have modeled the probability of treatment correctly. You're trading one modeling assumption (that you have correctly modeled the outcome) for another (that you have correctly modeled the treatment assignment process). You don't need to make assumptions about moderation of the treatment effect, which is one reason to prefer propensity score-based methods. You still need to ensure you've collected and included the right variables and that they are measured without error.
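For comparison, here is a minimal IPW sketch under the same kind of hypothetical setup: instead of modeling the outcome, you model the probability of treatment (here with a logistic regression) and weight each observation by the inverse probability of the treatment it actually received.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 5_000
x = rng.normal(size=n)
a = (rng.random(n) < 1 / (1 + np.exp(-x))).astype(float)  # logistic treatment assignment
y = 2.0 * a + 3.0 * x + rng.normal(size=n)

# Propensity score model: P(A = 1 | X)
ps = sm.Logit(a, sm.add_constant(x)).fit(disp=0).predict()
w = np.where(a == 1, 1 / ps, 1 / (1 - ps))  # inverse-probability weights

# Weighted regression of y on a alone estimates the average treatment effect
ipw = sm.WLS(y, sm.add_constant(a), weights=w).fit()
print(ipw.params[1])  # roughly 2.0
```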
Note that the causal inference field has moved well beyond using linear regression to control for confounding. See my post here for contemporary methods.
For a slightly deeper explanation of how regression works:
When you regress Y on A (treatment) and X (covariates), you get a coefficient for A that can be interpreted as the unique effect of A on Y holding the covariates constant. Another way to see this is the following:
Regress Y on X and take the residuals, R_Y. This is the part of Y that is independent of the covariates X. Now do the same with A: regress A on X (using a linear model, even though the actual assignment model might be nonlinear, e.g., logistic) and take the residuals, R_A. This is the part of A that is independent of the covariates X. You now have two variables, R_Y and R_A, that are completely purged of their (linear) association with the covariates X. If you regress R_Y on R_A, the coefficient you get is exactly equal to the coefficient on A you would get if you regressed Y on A and X. This interpretation hopefully makes it clearer why including covariates in a regression of the outcome on treatment removes the confounding effects of the covariates (again, only under certain very strict assumptions).
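Here is a numerical check of that residual-on-residual argument (this is the Frisch-Waugh-Lovell theorem), again using hypothetical simulated data: the coefficient from regressing R_Y on R_A matches the coefficient on A from the full regression.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 5_000
x = rng.normal(size=n)
a = (x + rng.normal(size=n) > 0).astype(float)
y = 2.0 * a + 3.0 * x + rng.normal(size=n)

Xc = sm.add_constant(x)
r_y = sm.OLS(y, Xc).fit().resid   # part of y independent of x
r_a = sm.OLS(a, Xc).fit().resid   # part of a independent of x

fwl = sm.OLS(r_y, r_a).fit()      # residual-on-residual regression
full = sm.OLS(y, sm.add_constant(np.column_stack([a, x]))).fit()
print(fwl.params[0], full.params[1])  # identical coefficients
```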
Schafer, J. L., & Kang, J. (2008). Average causal effects from nonrandomized studies: A practical guide and simulated example. Psychological Methods, 13(4), 279–313. https://doi.org/10.1037/a0014268