How is the function learned in a Deep Neural Network non-convex?

Question

The function learned by a deep neural network is essentially a composition of different functions. For example, in a CNN the first function is a convolution (a linear function), followed by max-pooling (a convex function), followed by a non-linearity (sigmoid, which is convex), and so on. If the basic functions are convex, then how come their composition is non-convex?

I read in convex optimization that the composition of convex functions is convex, so why do we always say that DNNs have a non-convex energy?

@kjetil b halvorsen pointed out that the sigmoid function is not convex. Beyond that, the statement which you allegedly read "that composition of convex functions is convex" is false, unless additional assumptions (restrictions) are made. For example, exp(-x) is convex, but its composition with itself, exp(-exp(-x)) is neither convex nor concave. See section 3.2 "Operations that preserve convexity" in Boyd and Vandenberghe "Convex Optimization", which is available for free at http://stanford.edu/~boyd/cvxbook/bv_cvxbook.pdf . — Mark L. Stone, May 08 '16 at 15:10
Hi, I posted an answer on a similar topic here - https://stats.stackexchange.com/questions/106334/cost-function-of-neural-network-is-non-convex/290691#290691 I would be glad if we could chat in the comments about anything I have misunderstood (or any extra thoughts). — Konstantin Burlachenko, Jul 10 '17 at 01:20
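The two claims in the first comment (the sigmoid is not convex, and exp(-exp(-x)) is neither convex nor concave even though exp(-x) is convex) can be checked numerically. The following is a minimal sketch, not part of the original thread: it uses the sign of a central second difference as a stand-in for the sign of the second derivative, and the helper names, the sample points and the step size `h` are all illustrative choices.

```python
import math


def sigmoid(x):
    """Logistic sigmoid 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def neg_exp(x):
    """exp(-x), which is convex on the whole real line."""
    return math.exp(-x)


def double_exp(x):
    """exp(-x) composed with itself: exp(-exp(-x))."""
    return math.exp(-math.exp(-x))


def second_difference(f, x, h=1e-3):
    """Central second difference; its sign approximates the sign of f''(x)."""
    return f(x + h) - 2.0 * f(x) + f(x - h)


def curvature_is_positive(f, points):
    """True where f curves upward (locally convex), False where it curves downward."""
    return [second_difference(f, x) > 0 for x in points]


# Sample points on both sides of zero (illustrative choice).
points = [-2.0, -1.0, 1.0, 2.0]

neg_exp_signs = curvature_is_positive(neg_exp, points)
sigmoid_signs = curvature_is_positive(sigmoid, points)
double_exp_signs = curvature_is_positive(double_exp, points)

print("exp(-x):       ", neg_exp_signs)
print("sigmoid(x):    ", sigmoid_signs)
print("exp(-exp(-x)): ", double_exp_signs)
```

A convex function would show positive curvature at every sample point. Only exp(-x) does; the sigmoid and exp(-exp(-x)) both curve upward for negative x and downward for positive x, so neither is convex or concave on the real line.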

score 1 · Accepted Answer · answered May 08 '16 at 19:18

As pointed out in the comments, you must have misunderstood what you read: a composition of convex functions is not convex in general, and the sigmoid is not convex to begin with. Apart from that, neural networks are universal approximators, that is, they can uniformly approximate any continuous function on compact sets (this already holds for networks with a single hidden layer; deep nets can possibly approximate even more functions). If neural networks always gave convex functions, that property would fail, because many continuous functions are not convex.
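To make the argument concrete, here is a small sketch (an illustration added here, not the answerer's own construction) of a network with one hidden layer of two sigmoid units whose output is a bump in its input. The weights are hand-picked assumptions, and `midpoint_gap` is a hypothetical helper that measures how far the function sits above or below a chord at its midpoint.

```python
import math


def sigmoid(z):
    """Logistic sigmoid 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def tiny_network(x):
    """One hidden layer with two sigmoid units and output weights +1 and -1.

    The weights and biases are hand-picked for illustration; the result is
    a bump centred at x = 0.
    """
    return sigmoid(x + 2.0) - sigmoid(x - 2.0)


def midpoint_gap(f, a, b):
    """f at the midpoint of [a, b] minus the average of f(a) and f(b).

    A convex f never gives a positive gap; a concave f never gives a negative one.
    """
    return f(0.5 * (a + b)) - 0.5 * (f(a) + f(b))


# Across the bump, the midpoint lies above the chord, so f is not convex.
gap_across_bump = midpoint_gap(tiny_network, -6.0, 6.0)

# On the right tail, the midpoint lies below the chord, so f is not concave.
gap_on_tail = midpoint_gap(tiny_network, 3.0, 9.0)

print("gap across the bump:", gap_across_bump)
print("gap on the tail:    ", gap_on_tail)
```

The first gap is positive and the second is negative, so this network's output is neither convex nor concave as a function of its input, which is consistent with the universal approximation argument above.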
