
So my goal is to minimize

$$\frac{1}{n} \sum_{i=1}^n (y'_i - y_i)^2$$

where $y'_i$ is the output of the network and $y_i$ is the target label.

I have two questions:

  1. What is the name of this minimization function? (Least sum of squares?)

  2. If I want to implement it in neural networks, what loss function do I use?

Thank you!

YohanRoth

2 Answers


This is called mean-squared-error loss.
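
As a rough illustration of your question 2, here is how this loss might look in code. This is a minimal sketch assuming PyTorch; the architecture, data shapes, and learning rate are placeholders for the example, not something from your question:

```python
import torch
import torch.nn as nn

# Hypothetical regression network; the architecture is arbitrary.
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))

# nn.MSELoss with the default reduction computes (1/n) * sum((y' - y)^2),
# exactly the quantity in the question.
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

# Dummy data: 64 examples with 10 features and real-valued targets.
x = torch.randn(64, 10)
y = torch.randn(64, 1)

optimizer.zero_grad()
y_pred = model(x)          # y' in the question's notation
loss = loss_fn(y_pred, y)  # mean squared error
loss.backward()
optimizer.step()
```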

If you try to use this loss, and train the model with gradient descent, you may run into a problem. This is because it sounds like you have a classification task, since you write about "labels". A neural network for classification with no hidden layer and softmax outputs is exactly a logistic regression. If you attempt to use mean-squared-error loss to estimate a logistic regression, you'll run into problems because this optimization task is not convex.
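
To make the non-convexity concrete (using my own notation, with $\sigma(z) = 1/(1 + e^{-z})$ the sigmoid and $w$ the weights): the squared loss for logistic regression is

$$\frac{1}{n} \sum_{i=1}^n \left(\sigma(w^\top x_i) - y_i\right)^2,$$

which is not convex in $w$ because the nonlinear sigmoid sits inside the square. The usual cross-entropy loss,

$$-\frac{1}{n} \sum_{i=1}^n \Big[ y_i \log \sigma(w^\top x_i) + (1 - y_i) \log\big(1 - \sigma(w^\top x_i)\big) \Big],$$

is convex in $w$, which is one reason it is the standard choice.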

For more information, see What is happening here, when I use squared loss in logistic regression setting?

Sycorax
  1. This equation is known as mean squared error (mean squared loss).

  2. This equation is appropriate when $y'_i$ and $y_i$ are both real values (a regression problem): we try to minimize the mean squared difference between the actual and predicted values. In a classification problem it breaks down. There, $y_i$ takes only 2 values (for binary classification) or $n$ values, where $n$ is the number of classes, and the goal is to correctly estimate the probability of each class; squared error does not handle errors in probabilities well. If you apply this equation to a classification problem you are likely to run into trouble, so a probabilistic loss such as cross-entropy is used instead (sketched below).
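
As a sketch of that standard alternative (assuming PyTorch; the class count and shapes are made up for illustration):

```python
import torch
import torch.nn as nn

# Hypothetical 3-class classifier; the architecture is arbitrary.
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 3))

# CrossEntropyLoss expects raw logits and integer class labels;
# it applies log-softmax internally, unlike MSELoss.
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(64, 10)         # 64 examples, 10 features
y = torch.randint(0, 3, (64,))  # class labels in {0, 1, 2}

logits = model(x)               # shape (64, 3)
loss = loss_fn(logits, y)
loss.backward()
```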