
A neural network is several linear transformations $L_1,\ldots,L_m$ that are sequentially applied to a feature vector $X$. A composition of linear transformations is itself a linear transformation, so in the end we get $LX$, where $L$ is the composition of $L_1,\ldots,L_m$.

The question is: if a neural network eventually amounts to applying a single linear transformation to a feature vector, what is the essential difference between neural networks and linear regression?
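For concreteness, here is a minimal numpy sketch of that premise (the shapes and weights are arbitrary, purely for illustration): applying the layers one after another gives exactly the same output as the single composed matrix.

```python
import numpy as np

# Arbitrary illustrative shapes and weights: x plays the role of the
# feature vector X, L1 and L2 of two linear transformations.
rng = np.random.default_rng(0)
x = rng.normal(size=3)
L1 = rng.normal(size=(4, 3))
L2 = rng.normal(size=(2, 4))

# Applying the transformations sequentially...
y_sequential = L2 @ (L1 @ x)

# ...equals applying the single composed transformation L = L2 L1.
L = L2 @ L1
print(np.allclose(y_sequential, L @ x))  # True
```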

Eugeny89
  • http://stackoverflow.com/questions/9782071/why-must-a-nonlinear-activation-function-be-used-in-a-backpropagation-neural-net – SmallChess Feb 04 '17 at 13:22
  • Most common activation functions for neural networks are sigmoid and hyperbolic tangent, which are not linear transformations. – Łukasz Grad Feb 04 '17 at 13:22
  • The transformation may be linear but the output is almost always transformed by a non-linear function. – SmallChess Feb 04 '17 at 13:24
  • @StudentT but I can apply a non-linear transformation to the result of a regression as well – Eugeny89 Feb 04 '17 at 13:26
  • See https://en.wikipedia.org/wiki/Universal_approximation_theorem, which states that you need at minimum 1 hidden layer to approximate any continuous function, so a perceptron is not enough – Łukasz Grad Feb 04 '17 at 13:31
  • @Eugeny89 A neural network with only the output transformed non-linearly is a special case which is almost never used in practice. So yes, there is a connection between linear regression and neural networks in a special case. – Hugh Feb 04 '17 at 13:32
  • FYI [What is the difference between 'regular' linear regression and deep learning linear regression?](http://stats.stackexchange.com/q/253337/12359) – Franck Dernoncourt Feb 04 '17 at 15:32
  • Research funding. – usεr11852 Dec 15 '19 at 16:02

1 Answer


No, a neural network is not several consecutive linear transformations. As you note, that would only result in another linear transformation in the end, so why apply many instead of one? Actually, a neural network performs several (at least one, but possibly more, depending on the number of hidden layers) nonlinear (e.g. sigmoid) transformations.

That is also the difference between a neural network and a linear regression, since the latter uses a linear combination of regressors to approximate the regressand.
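As a minimal illustration of this point (a numpy sketch with arbitrary weights, assuming a sigmoid activation between the layers): once a nonlinearity sits between the layers, the network no longer collapses into a single matrix.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = rng.normal(size=3)          # feature vector
W1 = rng.normal(size=(4, 3))    # first layer weights
W2 = rng.normal(size=(2, 4))    # second layer weights

# With the sigmoid between the layers, the composition is no longer
# linear, so it cannot be reduced to the single matrix W2 W1.
y_network = W2 @ sigmoid(W1 @ x)
y_collapsed = (W2 @ W1) @ x
print(np.allclose(y_network, y_collapsed))  # False
```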

Richard Hardy
  • (minor detail, which I guess you are aware of, but just for neophytes: A neural network is not *necessarily* several consecutive linear transformations.) – Franck Dernoncourt Feb 04 '17 at 15:36
  • @FranckDernoncourt, thank you. Yes, a network may have $m\geq 1$ layers, so it is of course possible that $m=1$. – Richard Hardy Feb 04 '17 at 16:18