I am confused after studying diverse educational material about Structural Equation Modeling (SEM) and Bayesian Networks (BN) over the last few years. Others also seem to experience a similar issue, e.g. here or here.
As a result of this confusion, I am unsure how to approach the analysis for the problem I show below. My methods should be closest, if not identical, to what is referred to as BN analysis in the material of Statistical Rethinking: I create a DAG based on my knowledge of a system, then I test the DAG by checking whether the conditional independencies it implies hold in the data.
In this particular case, my supervisor did a literature review and came up with the DAG shown. He asked me to examine whether X has an effect on Y and, if so, whether this effect is (partially) mediated by Z1 and Z2. He also asked me whether I could report my result by showing all path coefficients in this DAG (just assuming it is true); he does not care whether I use SEM or BN. It is a bit ironic, but having had courses in both methods led to a confusion that was not there, or at least was much smaller, when I studied either method on its own. I would appreciate it if someone could read the explanation below and tell me whether this is the (or a) correct procedure for mediation analysis in the BN framework.
If I understand correctly, in a BN:
Total effect = Direct effect + indirect effect
A path coefficient should always be the direct effect. For example, for the arrow X → $Z_2$, the total and the direct effect are equal according to the DAG and I would run a regression model $Z_2 \sim X$ and the $\beta$ coefficient $\beta_X$ will be my path coefficient.
As another example, for $Z_2$ → Y, the total effect differs from the direct effect, and to obtain the path coefficient (direct effect) I need to run $Y \sim Z_2 + Z_1 + X$. I need to include $Z_1$ because otherwise I get the total effect, and I always need to include X because, assuming the DAG, it is a common cause of $Z_2$ and Y, so leaving it out would bias the estimate in either case.
So, basically to obtain all path coefficients, I have to calculate all direct effects, given the DAG, is that correct?
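To make my understanding concrete, here is a minimal simulation sketch of "regress each node on its parents". Everything in it is an assumption for illustration: the edge set (X → $Z_2$, X → $Z_1$, $Z_2$ → $Z_1$, X → Y, $Z_1$ → Y, $Z_2$ → Y) is my reading of the DAG, the coefficient values are made up, and plain least squares stands in for the Bayesian regressions.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 50_000

# Hypothetical "true" path coefficients (made up for this sketch)
b_x_z2, b_x_z1, b_z2_z1 = 0.5, 0.3, 0.4
b_x_y, b_z1_y, b_z2_y = 0.2, 0.6, 0.7

# Simulate data following the assumed DAG, in topological order
x = rng.normal(size=n)
z2 = b_x_z2 * x + rng.normal(size=n)
z1 = b_x_z1 * x + b_z2_z1 * z2 + rng.normal(size=n)
y = b_x_y * x + b_z1_y * z1 + b_z2_y * z2 + rng.normal(size=n)


def ols(outcome, *predictors):
    """Least-squares slopes of outcome on predictors (intercept dropped)."""
    design = np.column_stack([np.ones(len(outcome)), *predictors])
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coef[1:]


# Path coefficients: each node regressed on all of its parents
path_into_z2 = ols(z2, x)        # [X -> Z2]
path_into_z1 = ols(z1, x, z2)    # [X -> Z1, Z2 -> Z1]
path_into_y = ols(y, x, z1, z2)  # [X -> Y, Z1 -> Y, Z2 -> Y]

# Leaving Z1 out gives the total effect of Z2 on Y instead of the direct one
total_z2_on_y = ols(y, z2, x)[0]

print("into Z2:", path_into_z2)
print("into Z1:", path_into_z1)
print("into Y: ", path_into_y)
print("total Z2 -> Y:", total_z2_on_y)
```

In this sketch the regression of Y on all three variables recovers the direct effect of $Z_2$ (about 0.7), while dropping $Z_1$ gives something close to the total effect $0.7 + 0.4 \times 0.6 = 0.94$, which is the distinction I describe above.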
For the mediation analysis, I first need to calculate the total effect X → Y. If there is a relationship, I then include Z1 and Z2 to see whether this relationship shrinks (partial mediation) or disappears (full mediation), while a relationship Z1 → Y and/or Z2 → Y must show. Calculating the path coefficients will help to illustrate the relationships among the variables.
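The mediation steps I have in mind could be sketched as follows. Again, the DAG edges and the coefficient values are assumptions made up for the simulation, and least squares is only a stand-in for the Bayesian models; in a real analysis I would compute the products from posterior draws rather than point estimates.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 50_000

# Simulated data under the assumed DAG with made-up coefficients
x = rng.normal(size=n)
z2 = 0.5 * x + rng.normal(size=n)
z1 = 0.3 * x + 0.4 * z2 + rng.normal(size=n)
y = 0.2 * x + 0.6 * z1 + 0.7 * z2 + rng.normal(size=n)


def slopes(outcome, *predictors):
    """Least-squares slopes of outcome on predictors (intercept dropped)."""
    design = np.column_stack([np.ones(len(outcome)), *predictors])
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coef[1:]


# Step 1: total effect of X on Y (no mediators in the model)
(total,) = slopes(y, x)

# Step 2: direct effect of X on Y (both mediators included)
direct, b_z1_y, b_z2_y = slopes(y, x, z1, z2)

# Step 3: path coefficients into the mediators
(a_x_z2,) = slopes(z2, x)
a_x_z1, a_z2_z1 = slopes(z1, x, z2)

# Indirect effect = sum over mediated paths of the product of coefficients
indirect_paths = {
    "X -> Z1 -> Y": a_x_z1 * b_z1_y,
    "X -> Z2 -> Y": a_x_z2 * b_z2_y,
    "X -> Z2 -> Z1 -> Y": a_x_z2 * a_z2_z1 * b_z1_y,
}
indirect = sum(indirect_paths.values())

print("total:", total, "direct:", direct, "indirect:", indirect)
for path, value in indirect_paths.items():
    print(path, value)
```

For linear models fitted by least squares, the total effect equals the direct effect plus the summed path products, so this is a numerical check of "Total effect = Direct effect + indirect effect" under the assumed DAG.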
To calculate all direct effects, I need to run many linear models. Is it necessary that I run this analysis at once (e.g. specify all the regressions in Stan and then run them in one go) or is it OK to estimate each path coefficient by a separate Bayesian regression (keeping priors for shared parameters between models constant)? E.g. in Statistical Rethinking, that is how a Mediation Analysis is shown, although there it was not necessary to calculate all path coefficients.
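My tentative reasoning for why separate regressions might be acceptable is the following factorization. It assumes the edge set X → $Z_2$, X → $Z_1$, $Z_2$ → $Z_1$, X → Y, $Z_1$ → Y, $Z_2$ → Y (my reading of the DAG), complete data, no latent variables, and priors that are independent across the three parameter blocks $\theta_{Z_2}$, $\theta_{Z_1}$, $\theta_Y$ (notation I introduce here for the coefficients and residual scale of each regression):

$$
p(\theta_{Z_2}, \theta_{Z_1}, \theta_Y \mid \text{data}) \;\propto\;
\underbrace{p(\theta_{Z_2}) \prod_i p(z_{2,i} \mid x_i, \theta_{Z_2})}_{Z_2 \sim X}
\;\times\;
\underbrace{p(\theta_{Z_1}) \prod_i p(z_{1,i} \mid x_i, z_{2,i}, \theta_{Z_1})}_{Z_1 \sim X + Z_2}
\;\times\;
\underbrace{p(\theta_Y) \prod_i p(y_i \mid x_i, z_{1,i}, z_{2,i}, \theta_Y)}_{Y \sim X + Z_1 + Z_2}
$$

If this is right, each path coefficient appears in exactly one factor, so no parameter is actually shared between the regressions, and fitting the three models separately should give the same posterior for every coefficient as fitting them jointly. Indirect effects would then be obtained by multiplying posterior draws taken from the separate fits. I am not sure whether this still holds once the assumptions above are relaxed, e.g. with missing values or correlated residuals.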