
I created a dummy dataset and compared the performance of SKLearn LinearRegression and Keras. Why is Keras producing horrible results compared to Linear Regression?

Code:

# Create Dataset
from sklearn.datasets import make_regression
X, y = make_regression(n_samples=5000, n_features=10, noise=0.1)

# Build Linear Regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
lr = LinearRegression()
lr.fit(X,y)
prediction_lr = lr.predict(X)

# Build Keras Linear Regression
from keras import models
from keras import layers
model = models.Sequential()
model.add(layers.Dense(1, activation='relu', input_dim=10))
model.compile(optimizer='rmsprop', loss='mse')
model.fit(X,y, epochs=100, verbose=0)
prediction_nn = model.predict(X)

print(f'LR MSE: {mean_squared_error(prediction_lr, y)}')
print(f'NN MSE: {mean_squared_error(prediction_nn, y)}')

Output:
LR MSE: 0.010068399696132291
NN MSE: 26936.27829985695

Why is there a dramatic difference of MSE? How can we replicate Linear Regression using Keras?
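For intuition on the size of the Keras error: a `relu` output unit can never predict below zero. Two quick numpy checks make the magnitude plausible (a sketch; the `random_state=0` draw below is my assumption, so the numbers only roughly match the 26936 above):

```python
import numpy as np
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=5000, n_features=10, noise=0.1, random_state=0)

# If the relu unit "dies" (its pre-activation is driven negative everywhere),
# the network predicts 0 for every sample; the MSE of an all-zero prediction
# is the mean of y**2, roughly Var(y).
mse_all_zero = np.mean(y ** 2)

# Even a *perfect* linear fit clipped at zero still pays full price
# on every negative target:
mse_clipped_perfect = np.mean((np.maximum(y, 0) - y) ** 2)

print(mse_all_zero, mse_clipped_perfect)
```

Both numbers land in the thousands to tens of thousands, i.e. an NN MSE of ~27000 is comparable in magnitude to a relu unit predicting at or near zero, not to a small optimization error.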

Thanks

thdwjdxor
  • You have a `relu` activation function in your output layer. You probably want a `linear` activation function that allows for values less than zero. // I disagree with the closure and think this is a specific problem that could generate an answer getting into what a neural network does to replicate linear regression (and I think this because I am writing such an answer in my head). // `sklearn` uses regularization by default, though regularization can be disabled (unlike in older versions). Disable the regularization to do ordinary least squares. – Dave Feb 07 '22 at 17:27
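Following the comment above, a minimal sketch of replicating linear regression in Keras: a single `Dense` unit with the default `linear` activation computes exactly ŷ = Xw + b, so the only remaining issue is giving the optimizer a step size that actually converges on targets this large. The `random_state`, the Adam learning rate, and the epoch count below are my assumptions, not from the original post:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error
from keras import layers, models, optimizers

X, y = make_regression(n_samples=5000, n_features=10, noise=0.1, random_state=0)

# One Dense unit with a linear (default) activation is exactly y_hat = Xw + b;
# relu here would clip every negative prediction to zero.
model = models.Sequential([
    layers.Input(shape=(10,)),
    layers.Dense(1, activation='linear'),
])

# rmsprop's default learning rate of 0.001 crawls on targets of this scale;
# a larger Adam step (assumed value) converges within a few hundred epochs.
model.compile(optimizer=optimizers.Adam(learning_rate=0.1), loss='mse')
model.fit(X, y, epochs=200, verbose=0)

prediction_nn = model.predict(X, verbose=0)
print(f'NN MSE: {mean_squared_error(y, prediction_nn)}')
```

With the linear output the MSE should drop by several orders of magnitude from the relu version; the exact value varies with initialization and will approach, but not exactly match, the closed-form sklearn fit.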

0 Answers