It depends what your research question is. I appreciate that you have a causal theory in mind, but it is always good to keep in mind that generally, at least with regression you need to decide what is the main exposure - and you will be typically wanting to estimate the total causal effect for that variable on the outcome. If you want to look at the whole picture and see all the direct and indirect estimates then a structural equation model (path diagram in this case) would be a better approach. However, regression is usually the approach most people choose:
In the model on the left, if $B$ is the main exposure then $A$ and $C$ are mediators and should not be conditioned on. However if $C$ is the main exposure then $B$ is a confounder of the path $C\rightarrow D$ and should be conditioned on. $A$ is a decendent of $B$ so this can be treated as a competing exposure and will increase the precision of the estimate for $B$. Similar logic applies if $A$ is the main exposure.
In the model on the right, if $C$ is the main exposure then you don't condition on either $A$ or $B$ because they are upstream. If A is the main exposure then $B$ confounds the $A\rightarrow C$ path (and hence $A \rightarrow C \rightarrow D$ and) should be conditioned on. If $B$ is the main exposure then $A$ is a mediator and should not be conditioned on.
Take a look here for more on this.
So as to the main question:
Is there a pragmatic way (SEM or other) to show whether the data favours one of these models over the other? If yes, what's the best or most pragmatic way to implement this procedure, e.g. in R?
So bearing the above in mind, yes, you can consider C to be the main exposure. Let's do a simple simulation. First for the model on the left:
rm(list=ls())
N <- 100
B <- rnorm(N)
A <- B + rnorm(N)
C <- B + rnorm(N)
D <- A + B + C + rnorm(N)
as per the discussion above, in this case we condition on B and A:
summary(lm(D ~ C + B + A))
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.1293 0.1210 -1.069 0.288
C 0.9208 0.1252 7.355 6.41e-11 ***
B 0.9741 0.2163 4.503 1.88e-05 ***
A 1.0682 0.1181 9.048 1.65e-14 ***
And we get signficant contributions from A and B, as expected.
Now, with the model on the right, A and B still have the same causal structure, so we just need to simular a new C and D:
C1 <- B + A + rnorm(N)
D1 <- C1 + rnorm(N)
Now, if we just wanted the causal effect of C1 on D1 we would not control for A and B, bue here we want to compare the same model we used above:
summary(lm(D1 ~ C1 + A + B))
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.031499 0.107806 0.292 0.7708
C1 0.851798 0.104286 8.168 1.25e-12 ***
A 0.005409 0.166517 0.032 0.9742
B 0.291614 0.169539 1.720 0.0886 .
...and now we find the contibutions from A and B drop out.
With an SEM approach:
library(lavaan)
dt1 <- data.frame(A,B,C,D)
m1.true <- '
D ~ A + B + C
C ~ A
A ~ B'
m1.true.res <- sem(m1, dt1)
summary(m1.true.res)
we obtain:
Regressions:
Estimate Std.Err z-value P(>|z|)
D ~
A 1.068 0.116 9.234 0.000
B 0.974 0.212 4.596 0.000
C 0.921 0.123 7.507 0.000
C ~
A -0.080 0.094 -0.855 0.393
B 0.996 0.141 7.057 0.000
and all of this is consistent with the first model. But if we try to fit the second model we get:
m1.false <- '
D ~ C
C ~ A + B
A ~ B'
m1.false.res <- sem(m1.false, dt1)
summary(m1.false.res)
Regressions:
Estimate Std.Err z-value P(>|z|)
D ~
C 2.057 0.173 11.880 0.000
C ~
A -0.080 0.094 -0.855 0.393
B 0.996 0.141 7.057 0.000
A ~
B 1.139 0.098 11.622 0.000
and from this we can see the estimate for C ~ A is small and non significant which is not consistent with the 2nd model. We also see that the estimate for D ~ C is larger then we expected - but we would not know this if we hadn't simulated the data.