Not a big deal - it is strongly stationary and approaches white noise
The non-invertible $\text{MA}(1)$ process makes perfect sense, and it does not exhibit any particularly strange behaviour. Taking the Gaussian version of the process, for any vector $\mathbf{y} = (y_1,...,y_n)$ consisting of consecutive observations, we have $\mathbf{y} \sim \text{N}(\mathbf{0}, \mathbf{\Sigma})$ with covariance:
$$\mathbf{\Sigma} \equiv \frac{\sigma^2}{1+\theta^2} \begin{bmatrix}
1+\theta^2 & -\theta & 0 & \cdots & 0 & 0 & 0 \\
-\theta & 1+\theta^2 & -\theta & \cdots & 0 & 0 & 0 \\
0 & - \theta & 1+\theta^2 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 1+\theta^2 & -\theta & 0 \\
0 & 0 & 0 & \cdots & -\theta & 1+\theta^2 & -\theta \\
0 & 0 & 0 & \cdots & 0 & -\theta & 1+\theta^2 \\
\end{bmatrix}.$$
As you can see, this is a strongly stationary process, and observations that are more than one lag apart are independent, even when $|\theta|>1$. This is unsurprising, in view of the fact that such observations do not share any of the underlying white noise terms. There does not appear to be any behaviour in which "the effect of past observations increases with the distance", and the equation you have stated does not establish this (see below for further discussion).
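This is easy to confirm by simulation. Below is a minimal sketch (my own illustration, not part of the question) that simulates a non-invertible Gaussian $\text{MA}(1)$, with the white noise scaled to match the normalisation of $\mathbf{\Sigma}$ above, and compares the sample autocovariances with the theoretical values; the parameter values are arbitrary.

```python
# Minimal simulation sketch: sample autocovariances of a non-invertible
# Gaussian MA(1), compared with the matrix Sigma above.
import numpy as np

rng = np.random.default_rng(0)
theta, sigma, n = 2.5, 1.0, 500_000        # |theta| > 1, so non-invertible

# Scale the white noise so that V(y_t) = sigma^2, matching the
# sigma^2 / (1 + theta^2) normalisation in the covariance matrix above
eps = rng.normal(0.0, sigma / np.sqrt(1 + theta**2), size=n + 1)
y = eps[1:] - theta * eps[:-1]             # y_t = eps_t - theta * eps_{t-1}

for k in range(4):
    gamma_hat = np.mean(y[k:] * y[:n - k])  # sample autocovariance at lag k
    print(f"lag {k}: {gamma_hat:+.4f}")
# Expected: lag 0 ~ sigma^2 = 1, lag 1 ~ -theta * sigma^2 / (1 + theta^2)
# ~ -0.3448, and all higher lags ~ 0, despite |theta| > 1.
```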
In fact, as $|\theta| \rightarrow \infty$ (which is the most extreme case of the phenomenon you are considering) the model reduces asymptotically to a trivial white noise process. This is completely unsurprising, in view of the fact that a large coefficient on the first-lagged error term dominates the unit coefficient on the concurrent error term, and shifts the model asymptotically towards the form $y_t \rightarrow -\theta \epsilon_{t-1}$, which is just a scaled and lagged version of the underlying white noise process.
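To make the limit explicit, divide the process equation $y_t = \epsilon_t - \theta \epsilon_{t-1}$ through by $-\theta$:
$$\frac{y_t}{-\theta} = \epsilon_{t-1} - \frac{\epsilon_t}{\theta} \longrightarrow \epsilon_{t-1}
\quad \quad \text{as } |\theta| \rightarrow \infty,$$
so, up to a deterministic rescaling, the series is asymptotically just the lagged white noise process.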
A note on your equation: In the equation in your question you write the current value of the observable time series as a geometrically increasing sum of past values, plus the left-over error terms. This is asserted to show that "the effect of past observations increases with the distance". However, the equation involves a large number of cancelling terms. To see this, let's expand out the past observable terms and make the cancellation explicit:
$$\begin{aligned}
y_t
&= \epsilon_t - \sum_{i=1}^{t-1} \theta^i y_{t-i} - \theta^t \epsilon_0 \\[6pt]
&= \epsilon_t - \sum_{i=1}^{t-1} \theta^i (\epsilon_{t-i} - \theta \epsilon_{t-i-1}) - \theta^t \epsilon_0 \\[6pt]
&= \epsilon_t - ( \theta \epsilon_{t-1} - \theta^2 \epsilon_{t-2} ) \\[6pt]
&\quad \quad \quad \quad \quad \ \ \ - ( \theta^2 \epsilon_{t-2} - \theta^3 \epsilon_{t-3} ) \\[6pt]
&\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad - ( \theta^3 \epsilon_{t-3} - \theta^4 \epsilon_{t-4} ) \\[6pt]
&\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \ \ \ - \ \cdots \\[6pt]
&\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \ \ \ - ( \theta^{t-1} \epsilon_1 - \theta^t \epsilon_0 ) - \theta^t \epsilon_0 \\[6pt]
&= \epsilon_t - \theta \epsilon_{t-1}.
\end{aligned}$$
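If you would like to see the cancellation numerically, here is a short sketch (my own check, with arbitrary parameter values) verifying that the right-hand side of your equation collapses back to $y_t = \epsilon_t - \theta \epsilon_{t-1}$:

```python
# Numerical check: the expansion above telescopes back to the simple
# MA(1) form y_t = eps_t - theta * eps_{t-1}.
import numpy as np

rng = np.random.default_rng(1)
theta, t = 2.5, 12                      # arbitrary non-invertible case
eps = rng.normal(size=t + 1)            # eps_0, eps_1, ..., eps_t
y = eps[1:] - theta * eps[:-1]          # y[k-1] holds y_k for k = 1, ..., t

# Right-hand side: eps_t - sum_{i=1}^{t-1} theta^i y_{t-i} - theta^t eps_0
rhs = eps[t] - sum(theta**i * y[t - i - 1] for i in range(1, t)) - theta**t * eps[0]

print(np.isclose(rhs, y[t - 1]))                     # True
print(np.isclose(rhs, eps[t] - theta * eps[t - 1]))  # True
```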
We can see from this expansion that the geometrically increasing sum of past values of the observable time series is there solely to get the previous error term:
$$\epsilon_{t-1} = \sum_{i=1}^{t-1} \theta^{i-1} y_{t-i} + \theta^{t-1} \epsilon_0.$$
All that is happening here is that you are trying to express the previous error term in an awkward way. The fact that a long cancelling sum of geometrically weighted values of the series is equal to the desired error term does not demonstrate that past observations are having "an effect" on the present time-series value. It merely means that if you want to express $\epsilon_{t-1}$ in terms of $\epsilon_0$ then the only way you can do it is to add in the geometrically weighted sum of the observable series.
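The same point can be checked numerically: the sketch below (my own illustration, with arbitrary parameter values) confirms that the geometrically weighted sum of past observations, plus the initial-error term, merely reconstructs $\epsilon_{t-1}$.

```python
# Numerical check: the geometrically weighted sum of past observations,
# plus the initial-error term, reconstructs eps_{t-1} exactly.
import numpy as np

rng = np.random.default_rng(1)
theta, t = 2.5, 12                      # arbitrary non-invertible case
eps = rng.normal(size=t + 1)            # eps_0, eps_1, ..., eps_t
y = eps[1:] - theta * eps[:-1]          # y[k-1] holds y_k for k = 1, ..., t

recovered = sum(theta**(i - 1) * y[t - i - 1] for i in range(1, t)) + theta**(t - 1) * eps[0]
print(np.isclose(recovered, eps[t - 1]))  # True: the sum is just eps_{t-1}
```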