I'm training a CNN to predict the coordinates of an object placed on a 2D grid. The problem is that the error will not go below 3 cm. My dataset consists of 1000 images (240x135 resolution). The model has 7 conv-maxpool layers followed by 5 dense layers, all with ReLU activations.
How can I reduce the error further?
So far I have tried training for longer, mini-batch gradient descent, L2 regularization, and dropout.
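
In case it helps to reason about the architecture, here is a rough sketch of how the feature-map size shrinks through the 7 conv-maxpool stages for a 240x135 input. It is only an illustration under assumptions that are not stated above: each convolution is taken to preserve spatial size ("same" padding), and each max-pool is taken to be 2x2 with stride 2 and no padding. The helper names `pooled_size` and `trace_feature_map` are made up for this sketch.

```python
def pooled_size(size, pool=2, stride=2):
    """Output length of an unpadded max-pool (floor mode) along one axis."""
    return (size - pool) // stride + 1


def trace_feature_map(width, height, num_stages):
    """Return the (width, height) of the feature map after each stage.

    Assumption: every convolution uses 'same' padding, so only the
    pooling step changes the spatial size.
    """
    sizes = [(width, height)]
    for _ in range(num_stages):
        width, height = pooled_size(width), pooled_size(height)
        sizes.append((width, height))
    return sizes


# 240x135 input, 7 conv-maxpool stages (pool size and stride are assumed).
sizes = trace_feature_map(240, 135, 7)
for stage, (w, h) in enumerate(sizes):
    print(f"after stage {stage}: {w} x {h}")
```

If these assumptions are close to the real layers, the feature map is already down to 1 x 1 by the last stage, which may be relevant to how much positional detail reaches the dense layers.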