The post about how to choose the number of hidden layers and neurons was extremely helpful. The rules of thumb often gave me a good starting point. However, I'm now thinking about varying the number of neurons per layer.
How should one choose a different number of neurons per layer? Are there any rules of thumb? And is there a good explanation of how, for example, a bottleneck or an enlarged layer in the middle influences the network?
EDIT 1:
Some more background information about the problem: I'm working on some reinforcement learning projects. I have used Q-learning so far and am now trying some DQN approaches. I started with OpenAI Gym's CartPole environment to test the basics. The input vector has length 4 and the output vector has length 2. Following the rule of thumb, I chose 2 hidden layers; 7 neurons per layer has worked best so far. The loss is MSE and I train with the Adam optimizer. ReLU is used as the activation function.
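To make the setup concrete, here is a minimal NumPy sketch of the architecture described above (a 4-7-7-2 MLP with ReLU hidden layers and a linear output for the two Q-values). The weight initialization and function names are my own assumptions, not code from my actual project:

```python
import numpy as np

rng = np.random.default_rng(0)

# Architecture as described: 4 inputs -> two hidden layers of 7 -> 2 Q-values.
layer_sizes = [4, 7, 7, 2]

# He-style initialization (an assumption; reasonable default for ReLU layers).
weights = [rng.standard_normal((n_in, n_out)) * np.sqrt(2.0 / n_in)
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]

def q_values(state):
    """Forward pass: ReLU on hidden layers, linear output (one Q-value per action)."""
    a = state
    for i, (W, b) in enumerate(zip(weights, biases)):
        a = a @ W + b
        if i < len(weights) - 1:  # apply ReLU only to hidden layers
            a = np.maximum(a, 0.0)
    return a

# Example CartPole observation: cart position, cart velocity, pole angle, pole angular velocity.
state = np.array([0.0, 0.1, -0.05, 0.2])
print(q_values(state).shape)  # (2,) -- one Q-value per action (left/right)
```

The question is essentially whether replacing the two equal-width hidden layers here (7 and 7) with unequal widths (e.g. wider then narrower, or the reverse) follows any useful rule.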
However, some more general answers would be great too, since this is only a small test project for later work.