Questions about (linear or nonlinear) least-squares, an estimation method used in statistics, signal processing and elsewhere.
Questions tagged [least-squares]
1786 questions
28
votes
3 answers
Difference between least squares and minimum norm solution
Consider a linear system of equations $Ax = b$.
If the system is overdetermined, the least squares (approximate) solution minimizes $||b - Ax||^2$. Some sources instead minimize $||b - Ax||$.
If the system is underdetermined one can calculate…
plasmacel
- 1,222
- 1
- 14
- 28
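A minimal numerical sketch of the distinction the question asks about (matrix names, sizes, and values here are illustrative, not from the question): `np.linalg.lstsq` minimizes $||b - Ax||$ in the overdetermined case, and in the underdetermined case returns the minimum-norm solution, which matches `np.linalg.pinv(A) @ b`.

```python
import numpy as np

# Overdetermined: 3 equations, 2 unknowns -> least-squares solution
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b = np.array([1.0, 2.0, 2.0])
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)
# x_ls satisfies the normal equations A^T A x = A^T b
assert np.allclose(A.T @ (A @ x_ls - b), 0)

# Underdetermined: 1 equation, 2 unknowns -> infinitely many exact solutions
A2 = np.array([[1.0, 1.0]])
b2 = np.array([2.0])
x_mn, *_ = np.linalg.lstsq(A2, b2, rcond=None)
# lstsq picks the minimum-norm one, which equals pinv(A2) @ b2
assert np.allclose(x_mn, np.linalg.pinv(A2) @ b2)
```

Here the underdetermined system $x_1 + x_2 = 2$ has the minimum-norm solution $(1, 1)$.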
28
votes
5 answers
Gradient of squared Frobenius norm of a matrix
In linear regression, the loss function is expressed as
$$\frac1N \left\|XW-Y\right\|_{\text{F}}^2$$
where $X, W, Y$ are matrices. Taking the derivative w.r.t. $W$ yields
$$\frac 2N \, X^T(XW-Y)$$
Why is this so?
wrek
- 475
- 1
- 7
- 26
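The stated gradient can be verified numerically with central finite differences (shapes and the random seed below are arbitrary, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
N, d, k = 5, 3, 2
X = rng.standard_normal((N, d))
Y = rng.standard_normal((N, k))
W = rng.standard_normal((d, k))

def f(W):
    # loss (1/N) * ||XW - Y||_F^2
    return np.sum((X @ W - Y) ** 2) / N

# claimed closed-form gradient: (2/N) X^T (XW - Y)
grad = (2.0 / N) * X.T @ (X @ W - Y)

# central-difference approximation, entry by entry
eps = 1e-6
num = np.zeros_like(W)
for i in range(d):
    for j in range(k):
        E = np.zeros_like(W)
        E[i, j] = eps
        num[i, j] = (f(W + E) - f(W - E)) / (2 * eps)

assert np.allclose(grad, num, atol=1e-5)
```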
27
votes
4 answers
Why does SVD provide the least squares and least norm solution to $ A x = b $?
I am studying the Singular Value Decomposition and its properties. It is widely used to solve equations of the form $Ax=b$. I have seen the following: when we have the equation system $Ax=b$, we compute the SVD of $A$ as $A=U\Sigma V^T$.…
Ufuk Can Bicici
- 2,896
- 2
- 25
- 49
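A numerical sketch of the SVD route on a rank-deficient example (sizes and seed are illustrative): invert only the nonzero singular values to form $x = V\Sigma^{+}U^T b$, and compare against numpy's least-squares and pseudoinverse solutions, both of which return the minimum-norm minimizer.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))  # 5x4, rank 2
b = rng.standard_normal(5)

U, s, Vt = np.linalg.svd(A, full_matrices=False)
tol = 1e-10
s_inv = np.zeros_like(s)
s_inv[s > tol] = 1.0 / s[s > tol]   # invert only singular values above tol
x_svd = Vt.T @ (s_inv * (U.T @ b))  # x = V S^+ U^T b

x_np, *_ = np.linalg.lstsq(A, b, rcond=None)
assert np.allclose(x_svd, x_np)
assert np.allclose(x_svd, np.linalg.pinv(A) @ b)
```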
23
votes
3 answers
How does the SVD solve the least squares problem?
How do I prove that the least-squares solution for $$\text{minimize} \quad \|Ax-b\|_2$$ is $A^{+} b$, where $A^{+}$ is the pseudoinverse of $A$?
Elnaz
- 609
- 1
- 6
- 13
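A standard proof sketch, using the full SVD $A = U\Sigma V^T$ and the orthogonal invariance of the 2-norm:

```latex
\|Ax - b\|_2 = \|U\Sigma V^T x - b\|_2 = \|\Sigma y - U^T b\|_2,
\qquad y := V^T x,
```

and the right-hand side separates per coordinate,

```latex
\|\Sigma y - U^T b\|_2^2
  = \sum_{\sigma_i > 0} \bigl(\sigma_i y_i - (U^T b)_i\bigr)^2
  + \sum_{\sigma_i = 0} (U^T b)_i^2 .
```

The first sum is minimized by $y_i = (U^T b)_i / \sigma_i$ whenever $\sigma_i > 0$; the remaining $y_i$ are free, and choosing them zero gives the minimum-norm minimizer $x = V\Sigma^{+}U^T b = A^{+}b$.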
16
votes
4 answers
Matrix Calculus in Least-Square method
In the proof of the matrix solution of the least squares method, I see some matrix calculus that I have no clue about. Can anyone explain it to me or recommend a good link for studying this sort of matrix calculus?
In Least-Square method, we want to find such a…
Cancan
- 2,657
- 6
- 22
- 30
15
votes
2 answers
Least-squares solution to system of equations of $4 \times 4$ matrices with $2$ unknown matrices
This question is in the context of a robotics problem. The goal is to track a robot using both its onboard odometry system and a VR system (HTC Vive Pro) using a VR controller mounted to the robot.
What is known is the transformation between…
couka
- 151
- 6
15
votes
1 answer
Orthogonal Projection of $ z $ onto the Affine set $ \left\{ x \mid A x = b \right\} $
Suppose $A$ is fat (more columns than rows) and has full row rank. The projection of $z$ onto the affine set $\{x\mid Ax = b\}$ is
$$P(z) = z - A^T(AA^T)^{-1}(Az-b)$$
How to show this?
Note: $A^T(AA^T)^{-1}$ is the pseudo-inverse of $A$…
sleeve chen
- 8,041
- 8
- 43
- 104
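The formula can be checked numerically (sizes and seed below are arbitrary): $P(z)$ is feasible, and any other feasible point, obtained by perturbing $P(z)$ within the nullspace of $A$, is at least as far from $z$.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 6))   # fat, full row rank (almost surely)
b = rng.standard_normal(3)
z = rng.standard_normal(6)

# P(z) = z - A^T (A A^T)^{-1} (A z - b)
Pz = z - A.T @ np.linalg.solve(A @ A.T, A @ z - b)

# Feasibility: A P(z) = b
assert np.allclose(A @ Pz, b)

# Optimality check: perturb within null(A); the result is still feasible
# but farther from z, since z - P(z) lies in the row space of A.
n = rng.standard_normal(6)
n = n - A.T @ np.linalg.solve(A @ A.T, A @ n)   # project n onto null(A)
y = Pz + n                                      # A y = b still holds
assert np.linalg.norm(z - y) >= np.linalg.norm(z - Pz)
```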
14
votes
3 answers
Solve least-squares minimization from overdetermined system with orthonormal constraint
I would like to find the rectangular matrix $X \in \mathbb{R}^{n \times k}$ that solves the following minimization problem:
$$
\mathop{\text{minimize }}_{X \in \mathbb{R}^{n \times k}} \left\| A X - B \right\|_F^2 \quad \text{ subject to } X^T X =…
Alec Jacobson
- 484
- 2
- 13
13
votes
1 answer
How do you solve linear least-squares modulo $2 \pi$?
I have an overdetermined system of $m$ equations ($i = 1, 2, \dots, m$)
$$ \sum_{j=1}^n A_{ij} \, x_j = y_i \pmod{2\pi} $$
where the $x$ coefficients are unknown, and $m > n$.
This is, essentially, the linear least squares problem but on…
XYZT
- 893
- 5
- 21
12
votes
2 answers
simple example of recursive least squares (RLS)
I'm vaguely familiar with recursive least squares algorithms; all the information about them I can find is in the general form with vector parameters and measurements.
Can someone point me towards a very simple example with numerical data, e.g. $y =…
Jason S
- 3,059
- 1
- 20
- 27
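A minimal concrete sketch of the kind the question asks for, fitting $y = ax + c$ one sample at a time (the names `theta` and `P`, the data, and the large initial covariance are illustrative choices, with forgetting factor fixed at 1):

```python
import numpy as np

theta = np.zeros(2)          # current estimate [a, c]
P = np.eye(2) * 1e6          # large initial covariance = weak prior

for x in [0.0, 1.0, 2.0, 3.0, 4.0]:
    y = 2.0 * x + 1.0        # noise-free measurements of y = 2x + 1
    phi = np.array([x, 1.0])               # regressor [x, 1]
    K = P @ phi / (1.0 + phi @ P @ phi)    # gain
    theta = theta + K * (y - phi @ theta)  # correct estimate by innovation
    P = P - np.outer(K, phi) @ P           # shrink covariance

assert np.allclose(theta, [2.0, 1.0], atol=1e-3)
```

After five exact samples the recursion has essentially recovered $a = 2$, $c = 1$; with the huge initial $P$, RLS coincides (up to a negligible regularization) with the batch least-squares fit of the data seen so far.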
12
votes
2 answers
Does gradient descent converge to a minimum-norm solution in least-squares problems?
Consider running gradient descent (GD) on the following optimization problem:
$$\arg\min_{\mathbf x \in \mathbb R^n} \| A\mathbf x-\mathbf b \|_2^2$$
where $\mathbf b$ lies in the column space of $A$, and the columns of $A$ are not linearly…
syeh_106
- 2,968
- 1
- 21
- 33
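A numerical illustration of the affirmative case (sizes, seed, and iteration count are arbitrary): since the gradient $2A^T(Ax - b)$ lies in the row space of $A$, gradient descent started at $x = 0$ stays in that row space and converges to the minimum-norm solution $A^{+}b$.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 6))        # underdetermined: many exact solutions
b = A @ rng.standard_normal(6)         # b in the column space of A

x = np.zeros(6)                        # zero initialization is essential here
L = 2 * np.linalg.norm(A, 2) ** 2      # Lipschitz constant of the gradient
step = 1.0 / L
for _ in range(20000):
    x = x - step * 2 * A.T @ (A @ x - b)

assert np.allclose(x, np.linalg.pinv(A) @ b, atol=1e-6)
```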
11
votes
2 answers
Why is the least-squares cost function for linear regression convex?
I was looking at Andrew Ng's machine learning course and for linear regression he defined a hypothesis function to be $h(x) = \theta_0 + \theta_1x_1 + ... + \theta_nx_n$, where $x$ is a vector of values
so the goal of linear regression is to find…
demalegabi
- 371
- 3
- 11
11
votes
4 answers
Prove that the system $A^T A x = A^T b$ always has a solution
Prove that the system $$A^T A x = A^T b$$ always has a solution. The matrices and vectors are all real. The matrix $A$ is $m \times n$.
I think it makes sense intuitively but I can't prove it formally.
Lundborg
- 1,594
- 13
- 23
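The reason is that $A^T b$ always lies in the column space of $A^T A$ (both equal the row space of $A$), so the normal equations are consistent even when $A^T A$ is singular. A check on a deliberately rank-deficient example (shapes and seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))  # 6x4, rank 2
b = rng.standard_normal(6)

G = A.T @ A
assert np.linalg.matrix_rank(G) == 2   # Gram matrix is singular
x = np.linalg.pinv(A) @ b              # one explicit solution
assert np.allclose(G @ x, A.T @ b)     # yet A^T A x = A^T b holds
```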
10
votes
5 answers
Proof of convexity of linear least squares
It's well known that linear least squares problems are convex optimization problems. Although this fact is stated in many texts on linear least squares, I could not find any proof of it. That is, a proof showing that the optimization…
Sanyo Mn
- 419
- 2
- 4
- 10
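One short argument, sketched here: expand the objective and inspect its Hessian.

```latex
f(x) = \|Ax-b\|_2^2 = x^T A^T A\, x - 2\, b^T A x + b^T b,
\qquad
\nabla^2 f(x) = 2\, A^T A .
```

Since $v^T (2A^T A) v = 2\|Av\|_2^2 \ge 0$ for every $v$, the Hessian is positive semidefinite everywhere, which is equivalent to convexity (and strict convexity when $A$ has full column rank, making $A^T A$ positive definite).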
10
votes
1 answer
Estimating Parameter - What is the qualitative difference between MLE fitting and Least Squares CDF fitting?
Given a parametric pdf $f(x;\lambda)$ and a set of data $\{ x_k \}_{k=1}^n$, here are two ways of formulating a problem of selecting an optimal parameter vector $\lambda^*$ to fit to the data. The first is maximum likelihood estimation (MLE):…
Ian
- 99,158
- 4
- 84
- 149