Suppose I have trained neural networks to solve a classification problem, each with $m$ hidden layers where layer $i$ has $n_i$ neurons: $$ n_1,\dots,n_m . $$ Let's also assume the layers shrink with depth, $$ n_1>n_2>\dots>n_m , $$ and that the total number of connections $N_c$ is held constant.
Now suppose that, after a few attempts, I find that networks with $m=M$ layers generally perform worse than networks with $m=M-1$. Does it make sense to try a network with $m=M+1$, or will it generally perform even worse?
Conversely, what are typical signs suggesting that adding a hidden layer, rather than adding neurons to an existing layer, might improve performance?
Answers to this popular question on CV:
How to choose the number of hidden layers and nodes in a feedforward neural network?
suggest going from 0 hidden layers to 1 if linear separation does not work for my data, but how can I tell whether my data are linearly separable in the first place?
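One heuristic I have seen (an assumption on my part, not from the linked answers): fit a purely linear classifier and look at its accuracy on the *training* set. If even the training data cannot be fit well by a linear model, the data are presumably not linearly separable and a hidden layer seems warranted. A minimal self-contained sketch using a hand-rolled perceptron (the function name and hyperparameters are illustrative, not standard):

```python
def perceptron_train_accuracy(X, y, epochs=100, lr=0.1):
    """Fit a perceptron (a linear classifier) and return its TRAINING accuracy.

    Training accuracy near 1.0 suggests the data may be linearly separable;
    a clearly lower value suggests a hidden layer could help.
    (Illustrative sketch; epochs/lr are arbitrary assumed defaults.)
    """
    n_features = len(X[0])
    w = [0.0] * n_features  # weight vector
    b = 0.0                 # bias term

    def predict(xi):
        return 1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0

    # Standard perceptron update rule: move weights toward misclassified points.
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = yi - predict(xi)
            w = [wj + lr * err * xj for wj, xj in zip(w, xi)]
            b += lr * err

    correct = sum(predict(xi) == yi for xi, yi in zip(X, y))
    return correct / len(y)

# AND is linearly separable; XOR is the classic non-separable counterexample.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
print(perceptron_train_accuracy(X, [0, 0, 0, 1]))  # AND: converges to 1.0
print(perceptron_train_accuracy(X, [0, 1, 1, 0]))  # XOR: stays below 1.0
```

This only probes one linear boundary, of course; in practice one might instead fit a logistic regression or linear SVM with a proper library and compare its score against a one-hidden-layer network.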