A convex set includes all points lying between any two points from the set. A convex function on such a set is a function lying below any straight line connecting two points from its graph. Convex optimization is concerned with searching for the minimum of such a function.
Questions tagged [convex]
136 questions
33
votes
6 answers
Why study convex optimization for theoretical machine learning?
I am working on theoretical machine learning — on transfer learning, to be specific — for my Ph.D.
Out of curiosity, why should I take a course on convex optimization?
What take-aways from convex optimization can I use in my research on…

Upendra01
- 1,566
- 4
- 18
- 28
30
votes
6 answers
For convex problems, does gradient in Stochastic Gradient Descent (SGD) always point at the global extreme value?
Given a convex cost function, using SGD for optimization, we will have a gradient (vector) at a certain point during the optimization process.
My question is, given the point on the convex, does the gradient only point at the direction at which the…

CyberPlayerOne
- 2,009
- 3
- 22
- 30
23
votes
1 answer
Why is the cost function of neural networks non-convex?
There is a similar thread here (Cost function of neural network is non-convex?) but I was not able to understand the points in the answers there and my reason for asking again hoping this will clarify some issues:
If I am using sum of squared…

Luca
- 4,410
- 3
- 30
- 52
23
votes
3 answers
Can there be multiple local optimum solutions when we solve a linear regression?
I read this statement on one old true/false exam:
We can get multiple local optimum solutions if we solve a linear
regression problem by minimizing the sum of squared errors using
gradient descent.
Solution: False
My question is, which part of…

Anjela Minoeu
- 331
- 1
- 2
- 6
17
votes
3 answers
Is PCA optimization convex?
The objective function of Principal Component Analysis (PCA) is minimizing the reconstruction error in L2 norm (see section 2.12 here. Another view is trying to maximize the variance on projection. We also have an excellent post here: What is the…

Haitao Du
- 32,885
- 17
- 118
- 213
12
votes
4 answers
How to Apply the Iteratively Reweighted Least Squares (IRLS) Method to the LASSO Model?
I have programmed a logistic regression using the IRLS algorithm. I would like to apply a LASSO penalization in order to automatically select the right features. At each iteration, the following is solved:
$$\mathbf{\left(X^TWX\right)…

Wok
- 1,065
- 10
- 22
11
votes
3 answers
Is Cross entropy cost function for neural network convex?
My teacher proved that 2nd derivate of cross-entropy is always positive, so that the cost function of neural networks using cross entropy is convex. Is this true? I'm quite confuse about this because I've always learned that the cost function of ANN…

xuancanh
- 113
- 1
- 1
- 6
8
votes
1 answer
Relating $f(\mathrm{Var}[X])$ to $\mathrm{Var}[f(X)]$ for Positive, Increasing, and Concave $f(X)$
The arrival of photons at a pixel in an image sensor is a Poisson distributed random variable such that the input can be modeled as a Poisson r.v. $X\sim \mathrm{Poisson}(\lambda)$.
Since the input is Poisson, the mean and variance are equal such…

Aaron Hendrickson
- 588
- 2
- 10
7
votes
2 answers
Convex optimization: why so much effort on it?
Why so much effort spent for solving convex programs?
For me (I am a newbie) all convex methods (complex maybe) look like a variations of derivative based methods which are not fundamentally better than dull gradient descent.
In the same time there…

Marat Zakirov
- 483
- 5
- 14
6
votes
2 answers
In online convex optimization, what is a leader in FTL algorithm?
I am currently reading into online convex optimization. Can someone please explain me what exactly is a leader in the Follow-The-Leader algorithm and its variants?
Why is it called Follow-The-Leader?

Fraïssé
- 961
- 2
- 13
- 29
5
votes
2 answers
Minimum expectation
The random variable $X$ has a continuous distribution. For an increasing density function $f(X)$ defined in the interval [0,1], what can be the minimum value of its expectation ${E} (X)$?
user312739
5
votes
1 answer
Are there any "convex neural networks"?
Are there any neural network training procedure that involves solving a convex problem?
Note that I am referring more to MLPs, instead of (multi-class) logistic regression which is a neural network with no hidden layers.
I know that for MLPs, if…

Fraïssé
- 961
- 2
- 13
- 29
5
votes
1 answer
Optimization: Convex function
Problem statement
Use the definition of convexity of a function, i.e., that for any $\boldsymbol{x}$, $\boldsymbol{y} \in \mathbb{R}^{d}$ and $\lambda \in \left [0,1 \right ]$ we have
\begin{align*}
f(\lambda \boldsymbol{x}…

Stochastic
- 799
- 1
- 6
- 28
5
votes
2 answers
L1-regularization enforces sparsity for convex functions
I have a convex function $f \colon \mathbb{R}^n \to \mathbb{R}$ that I minimize using L1-regularization:
$\DeclareMathOperator*{\argmin}{arg\,min}$
$$
x^*=\argmin_x f(x) + \lambda ||x||_1
$$
Can I assume that I achieve sparseness, i.e., that $x^*…

Peter
- 103
- 6
5
votes
1 answer
Online convex optimization: Why use strongly convex regularizer for regularized-follow-the-leader instead of strictly convex (or just convex)
I've read through Hazan's paper on online convex optimization. I don't quite understand why the regularization term must be strongly convex instead of more relaxed condition such as strictly…

Fraïssé
- 961
- 2
- 13
- 29