Following the discussion in "Does y = f(x) imply x granger-causes y", I have a deeper question about Granger causality.
Suppose I have a leaf flying in the wind that can only move back and forth along one dimension. I know that the leaf moves according to the law $F = ma$, where $F$ is the sum of the forces acting on the leaf, $m$ is its mass (a constant), and $a$ is its acceleration. Suppose two forces act on the leaf, $F_p$ (the pressure gradient) and $F_\mu$ (the viscous stress), and I have the time series for $a(t)$ (where $a(t) = \frac{1}{m}\left( F_p + F_\mu \right)$) as well as the time series for $F_p(t)$ and $F_\mu(t)$.
Now suppose I perform an autoregression on $a(t)$ and find that the absolute best possible model is an AR(3). Let's call it $a(t) \approx r_3$. I do the same for the forces and find that they are also AR(3). Let's call them $F_p(t) \approx p_3$ and $F_\mu(t) \approx \mu_3$, respectively. My understanding of Granger causality is this: if $F_p$ Granger-causes $a(t)$, then incorporating one or more terms from the AR model of $F_p$ improves the prediction of $a(t)$ compared to the AR model of $a(t)$ alone. In other words, $a(t) \approx p_3 + r_3$ must be better at predicting $a(t+1)$ than $a(t) \approx r_3$.
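For concreteness, here is how I would write that comparison as two nested regressions. This is only a sketch in my own notation: the constants $c$, $c'$, the coefficients $\alpha_k$, $\alpha'_k$, $\beta_k$, and the error terms $\varepsilon$, $\varepsilon'$ are assumptions for illustration, and the lag order 3 is taken from the AR(3) fits above. Note that in the usual textbook formulation the added terms are lagged values of $F_p$ itself rather than terms of its fitted AR model.

$$
\begin{aligned}
\text{restricted:}\quad a(t) &= c + \sum_{k=1}^{3} \alpha_k\, a(t-k) + \varepsilon(t) \\
\text{augmented:}\quad a(t) &= c' + \sum_{k=1}^{3} \alpha'_k\, a(t-k) + \sum_{k=1}^{3} \beta_k\, F_p(t-k) + \varepsilon'(t)
\end{aligned}
$$

Under this formulation, $F_p$ Granger-causes $a$ if the null hypothesis $\beta_1 = \beta_2 = \beta_3 = 0$ is rejected, that is, if the augmented model has a significantly smaller prediction-error variance than the restricted one.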
Question 1:
But if $r_3$ is already the best possible AR model of $\frac{1}{m}\left( F_p + F_\mu \right)$, then how could the inclusion of extra terms possibly improve that fit? (Clearly this reasoning is circular, but where is the flaw?) The broader question is: if I fit an AR model of the effect, how could additional terms from the AR models of the causes possibly improve the AR model of the effect, unless there is already some flaw or limitation in the procedure for generating an AR model of the effect?
Question 2:
Is it possible to improve that fit without including terms from the AR models of both $F_p$ and $F_\mu$?
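To make both questions concrete, here is a minimal simulation sketch of the setup. Everything specific in it is an assumption for illustration and not part of the problem statement: the mass `m = 2.0`, the AR(3) coefficients, the Gaussian noise, the independence of the two forces, and the helper names `simulate_ar3` and `lagged_rss`. It fits the own-lag model of $a(t)$ and the model augmented with lags of $F_p$ by ordinary least squares and compares their residual sums of squares.

```python
import numpy as np


def simulate_ar3(coeffs, n, rng, burn_in=200):
    """Simulate an AR(3) series driven by unit-variance Gaussian noise."""
    total = n + burn_in
    x = np.zeros(total)
    noise = rng.standard_normal(total)
    for t in range(3, total):
        x[t] = (coeffs[0] * x[t - 1] + coeffs[1] * x[t - 2]
                + coeffs[2] * x[t - 3] + noise[t])
    return x[burn_in:]  # discard the burn-in so start-up effects fade out


def lagged_rss(target, predictors, order=3):
    """Regress target[t] on a constant and lags 1..order of each predictor.

    Returns the in-sample residual sum of squares of the OLS fit.
    """
    n = len(target)
    columns = [np.ones(n - order)]
    for series in predictors:
        for k in range(1, order + 1):
            # Row i corresponds to time t = order + i; this column holds series[t - k].
            columns.append(series[order - k:n - k])
    design = np.column_stack(columns)
    response = target[order:]
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    residuals = response - design @ coef
    return float(residuals @ residuals)


rng = np.random.default_rng(0)
m = 2.0   # hypothetical mass
n = 2000  # hypothetical series length

# Hypothetical AR(3) forces; coefficients chosen so that the sum of their
# absolute values is below 1, which is sufficient for stationarity.
F_p = simulate_ar3((0.5, -0.2, 0.1), n, rng)
F_mu = simulate_ar3((0.3, 0.2, -0.1), n, rng)
a = (F_p + F_mu) / m

rss_restricted = lagged_rss(a, [a])        # own lags of a only
rss_augmented = lagged_rss(a, [a, F_p])    # own lags plus lags of F_p

print("RSS, own lags only:      ", rss_restricted)
print("RSS, plus lags of F_p:   ", rss_augmented)
```

Because the two models are nested, the in-sample RSS of the augmented fit can never exceed that of the restricted fit; a formal Granger test asks whether the reduction is larger than chance would explain, for example with an F-test on the extra coefficients. One caveat if the helper is called with lags of all three series at once: since $a(t)$ is an exact linear combination of $F_p(t)$ and $F_\mu(t)$ at every time step, those lag columns are linearly dependent, so `np.linalg.lstsq` returns a minimum-norm solution and the RSS matches the fit that uses $a$ and $F_p$ only.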