6

I know the computational cost for the closed-form solution of linear regression is $O(n^3)$, but I can't find a comparable cost statement for gradient descent.

There are some similar questions here where people talk about how gradient descent is more efficient, but they present formulas that are not in $O(\cdot)$ form and do not cite where they got their information.

So to reiterate, I am looking for the computational complexity for gradient descent in the form of $O(\cdot)$, something where $O(\cdot) < O(n^3)$.

It's possible I'm thinking about this wrong and there is no big $O$ comparison. If so please let me know. Thank you.

QuantStyle
    You can estimate parameters of linear regression using gradient descent, you probably meant OLS https://en.wikipedia.org/wiki/Ordinary_least_squares – Tim May 12 '19 at 10:05

3 Answers

5

The computational cost of gradient descent depends on the number of iterations it takes to converge. But according to the Machine Learning course by Stanford University, the complexity of gradient descent is $O(kn^2)$, so when $n$ is very large it is recommended to use gradient descent instead of the closed form of linear regression.
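For intuition, here is a minimal NumPy sketch of batch gradient descent for linear regression (my own illustration, not code from the course; the function name and learning rate are arbitrary). Each iteration is dominated by the matrix-vector products, so $k$ iterations over an $m \times n$ design matrix cost $O(kmn)$:

```python
import numpy as np

def gradient_descent_lr(X, y, lr=0.01, iters=1000):
    """Batch gradient descent for least-squares linear regression.

    X: (m, n) design matrix, y: (m,) targets.
    Each iteration costs O(m*n) for the two matrix-vector products,
    so the total cost is O(k*m*n) for k iterations.
    """
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        grad = X.T @ (X @ theta - y) / m  # gradient of the mean squared error, O(m*n)
        theta -= lr * grad
    return theta
```

On a well-conditioned problem this converges to the same coefficients the normal equations would give, without ever forming or inverting $X^\top X$.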

source: https://www.coursera.org/learn/machine-learning/supplement/bjjZW/normal-equation

  • 3
    Is it correct to assume that $k$ here is the number of iterations? Otherwise, what's the meaning of $k$? – gc5 Mar 18 '21 at 18:30
2

The number of iterations a gradient method takes to reach a local optimum within a prescribed tolerance is problem dependent: it depends on the shape of the surface you are exploring and on the initial guess. Hence, no general $O(\cdot)$ expression for the complexity can be given.

F. Tusell
  • Aside from what I said in my answer, where do you get from the $O(n^3)$ complexity for closed form linear regression? Check https://datascience.stackexchange.com/questions/35804/what-is-the-time-complexity-of-linear-regression. – F. Tusell Mar 26 '20 at 18:08
  • In that case, we can give it in terms of $t$, where $t$ is the number of iterations. For example, $O(tN^5)$ – Zeel B Patel Mar 06 '21 at 01:20
0

Gradient descent has a time complexity of $O(ndk)$, where $d$ is the number of features, $n$ is the number of rows, and $k$ is the number of iterations. So, when $d$ and $n$ are large, it is better to use gradient descent.
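As a quick sanity check (a sketch with a synthetic, well-conditioned design matrix; the sizes and learning rate are my own assumptions), both routes recover the same coefficients. The closed form pays $O(nd^2)$ to build the normal equations plus $O(d^3)$ to solve them, while gradient descent pays $O(ndk)$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 5
X = rng.normal(size=(n, d))
beta_true = np.arange(1, d + 1, dtype=float)
y = X @ beta_true

# Closed form: form X^T X in O(n*d^2), solve the normal equations in O(d^3).
beta_closed = np.linalg.solve(X.T @ X, X.T @ y)

# Gradient descent: k iterations at O(n*d) each, i.e. O(n*d*k) total.
beta_gd = np.zeros(d)
lr = 0.1
for _ in range(2000):
    beta_gd -= lr * X.T @ (X @ beta_gd - y) / n

# Both recover the true coefficients on this noiseless problem.
assert np.allclose(beta_closed, beta_gd, atol=1e-3)
```

Which route wins in practice depends on whether $d^3$ (one expensive solve) or $ndk$ (many cheap iterations) is smaller for your data.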