
I want to use a simple neural network to approximate a linear, time-discrete state-space model given by $$\boldsymbol{x}_{k+1} = \mathbf{A} \, \boldsymbol{x}_k$$ with $$\boldsymbol{x} = (x_1 \; x_2 \; x_3 \; x_4)^T.$$ I want to approximate $\mathbf{A}$ with a neural network that has one layer of 4 neurons, no activation function, and no bias. This yields the function $$\boldsymbol{x}_{k+1} = \mathbf{A}_{net} \, \boldsymbol{x}_k$$ for the neural network, with the weight matrix $$\mathbf{A}_{net} = \begin{pmatrix} w_{1,1} & \dots & w_{1,4} \\ \vdots & \ddots & \vdots \\ w_{4,1} & \dots & w_{4,4} \end{pmatrix} = \left(\begin{array}{c|c} \mathbf{A}_1 & \mathbf{A}_2 \\ \hline \mathbf{A}_3 & \mathbf{A}_4 \end{array}\right),$$ where the $\mathbf{A}_n$ are $2 \times 2$ blocks. The network has the same structure as the state-space model and is defined entirely by its weights.
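As a concrete sketch (NumPy, variable names hypothetical): the whole network reduces to a single trainable $4 \times 4$ matrix, and the named blocks are just views into it.

```python
import numpy as np

# The "network" is one linear layer with no bias and no activation,
# i.e. just a trainable 4x4 weight matrix A_net:
#     x_{k+1} = A_net @ x_k
A_net = np.zeros((4, 4))

# The 2x2 blocks named in the text are views into this matrix:
A1, A2 = A_net[:2, :2], A_net[:2, 2:]
A3, A4 = A_net[2:, :2], A_net[2:, 2:]

def step(A_net, x_k):
    """One-step prediction of the network / state-space model."""
    return A_net @ x_k
```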

Thus, it should be possible to train the network such that $$\mathbf{A}_{net} = \mathbf{A}.$$ This works when I have data for the time steps $k = 0, 1, 2, \dots$ for all components $x_1, x_2, x_3$ and $x_4$.
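To illustrate the fully observed case (a sketch with a hypothetical random system matrix, not the gradient training from my notebook): when every component of $\boldsymbol{x}_k$ is recorded, fitting the one-layer/no-bias network is an ordinary linear least-squares problem, and $\mathbf{A}$ is recovered exactly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 4x4 system matrix, scaled so trajectories stay bounded.
A = rng.normal(size=(4, 4))
A /= np.linalg.norm(A, 2)

# Simulate a trajectory with ALL four components recorded.
x = rng.normal(size=4)
traj = [x]
for _ in range(12):
    x = A @ x
    traj.append(x)
traj = np.array(traj).T           # columns are x_0, x_1, ..., x_12

X, Y = traj[:, :-1], traj[:, 1:]  # training pairs (x_k, x_{k+1})

# With the full state observed, the optimal A_net is the
# least-squares solution, which recovers A exactly here.
A_net = Y @ np.linalg.pinv(X)
```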

The challenge is that only the components $x_1$ and $x_2$ are available as data for the time steps $k = 0, 1, 2, \dots$, so training can only use these two variables. The components $x_3$ and $x_4$ are known only at the initial time step $k = 0$. Under these conditions, the optimizer only adjusts the matrices $\mathbf{A}_1$ and $\mathbf{A}_2$.
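A minimal sketch of why this happens, assuming a one-step squared-error loss on the observed components (gradient written out by hand in NumPy): the one-step prediction of $x_1, x_2$ reads only the first two rows of $\mathbf{A}_{net}$, so the rows holding $\mathbf{A}_3$ and $\mathbf{A}_4$ receive zero gradient.

```python
import numpy as np

rng = np.random.default_rng(0)
A_net = rng.normal(size=(4, 4))
x_k = rng.normal(size=4)
y = rng.normal(size=2)           # observed x_1, x_2 at step k+1

# One-step squared-error loss on the observed components only:
#     L = || (A_net @ x_k)[:2] - y ||^2
# Its gradient w.r.t. A_net is the outer product of the output
# error and x_k, and the error is zero for rows 3 and 4.
err = np.zeros(4)
err[:2] = (A_net @ x_k)[:2] - y
grad = 2.0 * np.outer(err, x_k)

print(grad[2:])                  # rows for A3 and A4: all zeros
```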

The question is how to structure the training such that $\mathbf{A}_3$ and $\mathbf{A}_4$ are also updated by the optimization algorithm.

My code: https://nbviewer.jupyter.org/github/wundi777/ANN/blob/master/Code_for_Question.ipynb

This problem has probably already been solved in the context of RNNs/LSTMs. Recommendations for literature are welcome.

wundi777