Let $(X\perp Y | Z)_P$ denote the conditional independence of X and Y given Z. I am very confused by the following theorem about d-separation from Judea Pearl's text:
"For any three disjoint subsets of nodes (X, Y, Z) in a DAG G and for all probability functions P, if $(X\perp Y | Z)_P$ holds in all distributions compatible with G, it follows that $(X\perp Y | Z)_G$."
Here Pearl uses the subscripts to distinguish the probabilistic notion of conditional independence, $(X\perp Y | Z)_P$, from the graphical notion of d-separation, $(X\perp Y | Z)_G$.
My confusion is the following:
1) What is the difference between $(X\perp Y | Z)_P$ and $(X\perp Y | Z)_G$? I know Pearl says the subscripts distinguish the probabilistic notion of conditional independence from the graphical notion of d-separation, but what exactly does that mean?
2) B does not d-separate A and D in the image below (i.e. $(A \not\perp D | B)_G$). My question is, what is $(A \not\perp D | B)_P$? Is it the following?
$P(A,B,C,D) = P(A)\,P(B|A)\,P(C|B)\,P(D|C,A)$
where $P(D|C,A) \neq P(D|C)\,P(D|A)$
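
To make question 2 concrete, here is a small numerical sketch rather than anything taken from Pearl's text. It assumes the graph implied by the factorization above (A -> B -> C -> D plus a direct edge A -> D), binary variables, and made-up conditional probabilities; the function names `joint`, `marginal` and `a_indep_d_given_b` are hypothetical helpers. It builds two distributions compatible with that graph and checks whether A and D are independent given B in each.

```python
from itertools import product

# Assumed graph (read off the factorization, not from the image):
# A -> B -> C -> D, plus a direct edge A -> D. All variables are binary.

def joint(p_a, p_b_given_a, p_c_given_b, p_d_given_ca):
    """Joint table {(a, b, c, d): prob} built as P(A) P(B|A) P(C|B) P(D|C,A).

    Each argument gives the probability that the variable equals 1.
    """
    def bern(p_one, value):
        return p_one if value == 1 else 1.0 - p_one

    table = {}
    for a, b, c, d in product((0, 1), repeat=4):
        table[(a, b, c, d)] = (
            bern(p_a, a)
            * bern(p_b_given_a[a], b)
            * bern(p_c_given_b[b], c)
            * bern(p_d_given_ca[(c, a)], d)
        )
    return table

POS = {"a": 0, "b": 1, "c": 2, "d": 3}

def marginal(table, **fixed):
    """Sum the joint over all entries that match the fixed values."""
    return sum(p for key, p in table.items()
               if all(key[POS[name]] == value for name, value in fixed.items()))

def a_indep_d_given_b(table, tol=1e-9):
    """True if P(a, d | b) = P(a | b) P(d | b) for every a, b, d.

    Checked in the division-free form P(a,b,d) P(b) = P(a,b) P(b,d).
    """
    for a, b, d in product((0, 1), repeat=3):
        lhs = marginal(table, a=a, b=b, d=d) * marginal(table, b=b)
        rhs = marginal(table, a=a, b=b) * marginal(table, b=b, d=d)
        if abs(lhs - rhs) > tol:
            return False
    return True

# Made-up numbers. In `generic`, D really depends on A; keys are (c, a).
generic = joint(0.3, {0: 0.2, 1: 0.7}, {0: 0.4, 1: 0.9},
                {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.6, (1, 1): 0.95})
# In `special`, P(D|C,A) happens not to vary with A, yet the table still
# has the form of the factorization for the same graph.
special = joint(0.3, {0: 0.2, 1: 0.7}, {0: 0.4, 1: 0.9},
                {(0, 0): 0.1, (0, 1): 0.1, (1, 0): 0.6, (1, 1): 0.6})

print(a_indep_d_given_b(generic))  # False
print(a_indep_d_given_b(special))  # True
```

If this sketch is right, the graphical statement $(A \not\perp D | B)_G$ does not force dependence in every compatible P: the second distribution is compatible with the graph and still satisfies $(A \perp D | B)_P$. That seems consistent with the theorem being phrased in terms of independencies that hold in all compatible distributions.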