I have three questions about forecasting with ARFIMA-type models.
Consider the standard stationary non-seasonal ARFIMA model written in terms of its coefficients:
$$ \left(1 - \sum_{i=1}^p\phi_{i} L^{i}\right) \left(1 - L\right)^{d_{frac}}\left(x_{t} - \mu \right) = \left(1 + \sum_{j=1}^{q} \theta_{j}L^{j}\right)a_{t},$$ where $a_{t}$ is white noise.
Define $\Phi\left(L\right) = 1 - \sum_{i=1}^p\phi_{i} L^{i}$ and $\Theta\left(L\right) = 1 + \sum_{j=1}^{q} \theta_{j}L^{j}$. Then we can write $$ x_{t} = \mu + \frac{\Theta\left(L\right)}{\Phi\left(L\right)\left(1-L\right)^{d_{frac}}}a_{t} = \mu + \sum_{k \geq 0}\psi_{k}a_{t-k},$$ which is the infinite MA representation of the stationary process.
We can rewrite it as $x_{t} = I_{k}\left(t - k\right) + C_{k}\left( t- k\right)$ (in the notation of Box and Jenkins), where $C_{k}\left(t-k\right)$ is the complementary function and $I_{k}\left(t - k\right)$ is the particular integral, which can be represented as $I_{k}\left(t-k\right) = \sum_{j=0}^{t-k-1}\psi_{j}a_{t-j}$.
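To make the $\psi_{k}$ weights of the infinite MA representation concrete, here is a minimal Python sketch that computes the first few of them. It assumes the sign conventions used above ($\Phi(L) = 1 - \sum\phi_i L^i$, $\Theta(L) = 1 + \sum\theta_j L^j$) and expands $(1-L)^{-d_{frac}}$ by its binomial series. The function names are hypothetical and not taken from any library.

```python
def frac_integration_weights(d, n):
    """First n coefficients of (1 - L)^(-d), via w_k = w_{k-1} * (k - 1 + d) / k."""
    w = [1.0]
    for k in range(1, n):
        w.append(w[-1] * (k - 1 + d) / k)
    return w


def arma_psi_weights(phi, theta, n):
    """First n coefficients of Theta(L) / Phi(L).

    Uses psi_k = theta_k + sum_{i=1..min(k,p)} phi_i * psi_{k-i}, with psi_0 = 1
    and theta_k = 0 for k > q.
    """
    psi = [1.0]
    for k in range(1, n):
        acc = theta[k - 1] if k <= len(theta) else 0.0
        for i in range(1, min(k, len(phi)) + 1):
            acc += phi[i - 1] * psi[k - i]
        psi.append(acc)
    return psi


def arfima_psi_weights(phi, theta, d, n):
    """First n psi weights of Theta(L) / (Phi(L) * (1 - L)^d).

    The product of two power series truncated at n terms gives the first n
    coefficients exactly, so no truncation error enters these values.
    """
    arma = arma_psi_weights(phi, theta, n)
    frac = frac_integration_weights(d, n)
    return [sum(arma[j] * frac[k - j] for j in range(k + 1)) for k in range(n)]


if __name__ == "__main__":
    # Illustrative parameter values only.
    print(arfima_psi_weights(phi=[0.5], theta=[0.3], d=0.2, n=6))
```

With `d = 0` the result reduces to the ordinary ARMA $\psi$ weights, which is a convenient sanity check.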
First question.
What is the representation of a nonstationary non-seasonal ARFIMA model? An integrated process can be written as $\left(1-L\right)^{d_{int}}\left(x_{t} - poly_{t}\right) = a_{t}$, where $poly_{t} = \sum_{i=0}^{d_{int}} \frac{\mu_{i}}{i!} t^i$. But how do I calculate the $\mu_{i}$ when fitting the model to the differenced series yields only a single $\mu$?
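As a small numerical illustration of how $poly_{t}$ interacts with differencing, the Python sketch below applies $\left(1-L\right)^{d_{int}}$ to a polynomial trend of the form above. The coefficient values are made up for illustration and the helper names are hypothetical. It only shows what integer differencing does to the trend itself, not how to estimate the $\mu_{i}$.

```python
from math import factorial


def poly_trend(mu, t):
    """Evaluate poly_t = sum_i mu[i] / i! * t**i for a list of coefficients mu."""
    return sum(m * t**i / factorial(i) for i, m in enumerate(mu))


def difference(series, d):
    """Apply the operator (1 - L) to a list d times."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series


if __name__ == "__main__":
    mu = [2.0, 3.0, 5.0]  # hypothetical mu_0, mu_1, mu_2, so d_int = 2
    trend = [poly_trend(mu, t) for t in range(8)]
    # After d_int = 2 differences, only the leading coefficient mu_2 remains.
    print(difference(trend, 2))
```

In this sketch the twice-differenced trend is constant and equal to `mu[2]`, while `mu[0]` and `mu[1]` no longer appear in the differenced values.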
Second question.
Let $\hat {x}_{t}\left(l\right)$ be the model's $l$-step-ahead forecast made at time $t$. Is it correct that $x_{t+l}$ can be written as $x_{t+l} - poly_{t+l}= \hat {x}_{t}\left(l\right) + e_{t}\left(l\right)$, where $e_{t}\left(l\right)$ is the forecast error, or should it be written as $x_{t+l} - poly_{t+l}= \hat {x}_{t}\left(l\right) - poly_{t} + e_{t}\left(l\right)$?
Third question. Every ARMA model with non-zero mean can be written as $$ \left(1 - \sum_{i=1}^p\phi_{i} L^{i}\right) \left(x_{t} - \mu \right) = \left(1 + \sum_{j=1}^{q} \theta_{j}L^{j}\right)a_{t},$$ which is equivalent to $$\left(1 - \sum_{i=1}^p\phi_{i} L^{i}\right) x_{t} = \theta_{0} + \left(1 + \sum_{j=1}^{q} \theta_{j}L^{j}\right)a_{t}$$ with $\theta_{0} = \mu \left(1 - \sum_{i=1}^p\phi_{i} \right)$. This can in turn be written as a model driven by white noise with non-zero mean: $$\left(1 - \sum_{i=1}^p\phi_{i} L^{i}\right) x_{t} = \left(1 + \sum_{j=1}^{q} \theta_{j}L^{j}\right)\xi_{t},$$ where the $\xi_{t}$ are i.i.d. with $E\xi_{t} = \frac{\theta_{0}}{\left(1 + \sum_{j=1}^{q} \theta_{j} \right)}$.
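For the plain ARMA case, the two constants can be checked with a few lines of Python. This is a minimal sketch with made-up parameter values and a hypothetical function name. It assumes $1 + \sum_j \theta_j \neq 0$ and $1 - \sum_i \phi_i \neq 0$.

```python
def arma_constants(mu, phi, theta):
    """Return (theta0, E[xi_t]) for an ARMA model with mean mu.

    theta0  = mu * Phi(1)        with Phi(1)   = 1 - sum(phi)
    E[xi_t] = theta0 / Theta(1)  with Theta(1) = 1 + sum(theta)
    Assumes Theta(1) != 0.
    """
    theta0 = mu * (1.0 - sum(phi))
    xi_mean = theta0 / (1.0 + sum(theta))
    return theta0, xi_mean


def implied_mean(xi_mean, phi, theta):
    """Stationary mean implied by the xi-form: Theta(1) * E[xi_t] / Phi(1)."""
    return xi_mean * (1.0 + sum(theta)) / (1.0 - sum(phi))


if __name__ == "__main__":
    phi, theta, mu = [0.5, 0.2], [0.4], 10.0  # illustrative values only
    theta0, xi_mean = arma_constants(mu, phi, theta)
    print(theta0, xi_mean, implied_mean(xi_mean, phi, theta))
```

Mapping $E\xi_{t}$ back through $\Theta(1)/\Phi(1)$ recovers $\mu$, which confirms that the two forms describe the same mean in the ARMA case.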
So how can the corresponding constant be calculated for a stationary or non-stationary ARFIMA model with a non-zero fractional differencing parameter?
I can't prove that the value of
$$\frac{\Theta\left(1\right)}{\Phi\left(1\right)\left(\sum_{k\geq0}{d \choose k}\left(-1\right)^{k}\right)}$$ converges for $-0.5 < d < 0.5$, where $d = d_{frac}$. A related question: if it does converge, how many coefficients of $\frac{1}{\sum_{k\geq0}{d \choose k}\left(-L\right)^{k}}$ should I compute to obtain a well-determined value of $\theta_{0}$?
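One way to explore the convergence question numerically is to look at partial sums of the series in the denominator. The Python sketch below is only a numerical check under a chosen truncation, not a proof, and the function names are hypothetical. It uses the recursion $c_{k} = c_{k-1}\,(k-1-d)/k$ for the coefficients $c_{k} = \left(-1\right)^{k}{d \choose k}$ of $\left(1-L\right)^{d}$.

```python
def binomial_series_coeffs(d, n):
    """Coefficients c_k = (-1)^k * C(d, k) of (1 - L)^d for k = 0..n."""
    c = [1.0]
    for k in range(1, n + 1):
        c.append(c[-1] * (k - 1 - d) / k)
    return c


def partial_sums(d, n):
    """Partial sums S_m = sum_{k=0..m} c_k for m = 0..n."""
    out, total = [], 0.0
    for coeff in binomial_series_coeffs(d, n):
        total += coeff
        out.append(total)
    return out


if __name__ == "__main__":
    # Inspect how the truncated denominator changes with the number of terms.
    for d in (-0.3, 0.0, 0.3):
        s = partial_sums(d, 10000)
        print(d, [s[m] for m in (10, 100, 1000, 10000)])
```

Printing the partial sums at increasing truncation points for several values of $d$ shows how the denominator behaves as more terms are included, and therefore how sensitive the ratio is to the number of coefficients used.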
Thank you.