
I am training a deep neural network with TensorFlow in Python for a regression problem. I have a set of inputs and outputs, and I created my NN (shown below), but after some epochs the error becomes almost constant and the network does not seem to learn anymore. I have tried many things, such as:

1 - increasing/decreasing number of layers

2 - increasing/decreasing number of neurons

3 - different optimizers

4 - different learning rates

5 - different activation functions

6 - normalizing inputs and outputs

7 - increasing number of epochs

and so on, but I couldn't solve the issue. I would appreciate any help with this.

here is my python code:

import pandas as pd
import tensorflow as tf
from tensorflow import keras
from sklearn.preprocessing import MinMaxScaler
import matplotlib.pyplot as plt
import numpy as np

# use "inputs"/"outputs" to avoid shadowing the built-in input()
inputs = pd.read_csv("./Data/All_Inputs.txt", delimiter = ",").to_numpy()
outputs = pd.read_csv("./Data/All_Outputs.txt", delimiter = ",").to_numpy()

x_train = inputs[0:48000, :]
y_train = outputs[0:48000, 12:1012]

x_test = inputs[48000:60000, :]
y_test = outputs[48000:60000, 12:1012]

x_valid = inputs[60000:72000, :]
y_valid = outputs[60000:72000, 12:1012]

scalerInput = MinMaxScaler()
x_train = scalerInput.fit_transform(x_train)
x_test = scalerInput.transform(x_test)
x_valid = scalerInput.transform(x_valid)

scalerOutput = MinMaxScaler()
y_train = scalerOutput.fit_transform(y_train)
y_test = scalerOutput.transform(y_test)
y_valid = scalerOutput.transform(y_valid)

model = keras.models.Sequential()
model.add(keras.layers.Dense(800, activation = "linear", input_shape = x_train.shape[1:]))
model.add(keras.layers.Dense(900, activation = "linear"))
model.add(keras.layers.Dense(900, activation = "linear"))
model.add(keras.layers.Dense(1000,  activation = "linear"))
opt = keras.optimizers.SGD(learning_rate = 0.01, momentum = 0.9)
model.compile(loss = "mse", optimizer = opt, metrics=['mae'])
history = model.fit(x_train, y_train, epochs = 20, validation_data=(x_valid, y_valid))
mse_test = model.evaluate(x_test, y_test)
print("\n\nTest set mean squared error is = ", mse_test)
pd.DataFrame(history.history).plot(figsize=(8, 5))
plt.grid(True)
plt.show()
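One thing worth double-checking in the model above: every `Dense` layer uses `activation = "linear"`, and a stack of purely linear layers is mathematically equivalent to a single affine transformation, so the extra layers add no expressive power. This is easy to verify with plain NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two "deep" linear layers: y = (x @ W1 + b1) @ W2 + b2
W1, b1 = rng.normal(size=(5, 8)), rng.normal(size=8)
W2, b2 = rng.normal(size=(8, 3)), rng.normal(size=3)

# ...are algebraically one affine map: y = x @ (W1 @ W2) + (b1 @ W2 + b2)
W, b = W1 @ W2, b1 @ W2 + b2

x = rng.normal(size=(4, 5))
deep = (x @ W1 + b1) @ W2 + b2
shallow = x @ W + b
print(np.allclose(deep, shallow))  # True: the "deep" linear net is one layer
```

If the linear activations are not intentional, a nonlinearity such as `"relu"` in the hidden layers is usually the first thing to try.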

Here is the training history:

[training history plot: loss and MAE curves flatten after the first few epochs]

I also tried more epochs (100 and 300), but the result was not much different; the history showed similarly flat curves.

How can I further reduce the training error?

  • This has likely been asked before: https://stats.stackexchange.com/q/365778/99274 – Carl May 10 '21 at 17:41
  • @Carl unfortunately couldn't help me :( – Inevitable May 10 '21 at 18:20
  • To determine your lowest possible training error, consider training your network several times on a **single** example. Then, test your network on the same example. Record your training and test error. Repeat this for two examples, then four examples, and so on as necessary. Finally, once you know what training and test errors you are looking for, start training and testing on separate but **small** (~10 examples) datasets. If this works well, start to gradually increase the size of your training and testing datasets. – mhdadk May 10 '21 at 19:28
  • This is discussed in more detail in [this answer](https://stats.stackexchange.com/a/352190/296197). – mhdadk May 10 '21 at 19:29
  • @mhdadk I tried to overfit my problem with a small dataset and a complex NN, but I got nearly the same error (I couldn't overfit my model). – Inevitable May 12 '21 at 11:55
  • Please check that you are performing preprocessing correctly by reading [this](https://scikit-learn.org/stable/modules/preprocessing.html) guide very carefully. If all else fails, try re-implementing your network and following my previous advice using [PyTorch](https://pytorch.org/tutorials/beginner/blitz/neural_networks_tutorial.html) instead. If even that doesn't work, check your small dataset very carefully. – mhdadk May 12 '21 at 12:34
  • @mhdadk Hi again :) I tried my problem on a very small set of data and also on the whole data, and I found two things: 1: I can overfit my model at high epochs. 2: I cannot reduce the error below a certain value. The lowest validation error I can get is almost the same for any network configuration, about MAE = 0.21. – Inevitable May 24 '21 at 10:34
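The sanity check mhdadk suggests in the comments (first drive the training error to ~0 on a single example) does not require Keras at all. Here is a minimal NumPy sketch of the idea on a hypothetical toy example:

```python
import numpy as np

rng = np.random.default_rng(0)

# One hypothetical training example: the check is whether the model
# can drive its error on this single example to ~0.
x = rng.normal(size=(1, 4))
y = rng.normal(size=(1, 2))

# Tiny linear model trained by plain gradient descent on squared error
W = np.zeros((4, 2))
b = np.zeros(2)
lr = 0.05
for _ in range(2000):
    pred = x @ W + b
    err = pred - y              # gradient of squared error w.r.t. pred (up to a constant)
    W -= lr * x.T @ err
    b -= lr * err.sum(axis=0)

loss = float(((x @ W + b - y) ** 2).mean())
print(loss)  # essentially zero: the model memorizes the single example
```

If a network cannot reach near-zero error even on one example, the problem is in the code or the data pipeline rather than in the model size.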

1 Answer


After a long time, I'm back to answer my own question, in case somebody in the future (or maybe in the past ;)) can get some intuition from it. The reason my training error became almost flat and stopped improving is the nature of my data. After examining my data thoroughly, I found that the outputs of my neural network are very close together, and for this reason the machine can't learn any further. Imagine training a model on images where all pixels have nearly the same RGB values. I hope this is helpful.
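A quick way to check this diagnosis on your own data is to look at the per-column spread of the target matrix. The array below is only a hypothetical stand-in for the real `All_Outputs.txt` data:

```python
import numpy as np

# Hypothetical stand-in for the real target matrix: values clustered near 5.0
rng = np.random.default_rng(0)
y = 5.0 + 0.01 * rng.normal(size=(1000, 1000))

spread = y.max(axis=0) - y.min(axis=0)        # per-column range
rel_spread = spread / np.abs(y).mean(axis=0)  # range relative to magnitude

print(f"median relative spread: {np.median(rel_spread):.4f}")
# A tiny relative spread means the targets are nearly constant,
# so there is little signal left for the network to fit.
```

In that situation the model quickly learns something close to the column means and then plateaus, which matches the flat training curves in the question.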
