This is not a homework assignment. I am trying to implement the gradient descent algorithm explained on page 5 of these notes: [http://cs229.stanford.edu/notes/cs229-notes1.pdf][1]. The code runs and eventually reaches the optimal coefficients given by the closed-form solution for simple linear regression, but it takes 400000 iterations to get there. And if I increase my learning rate from 0.00003 to 0.0003, the algorithm diverges instead of descending, i.e. the loss value explodes to infinity.

How do I reduce this large number of iterations and still reach the closed-form solution? And how should one set the learning rate for a gradient descent algorithm? Any useful advice would be much appreciated.
import pandas as pd
import numpy as np
# Data processing code
advertising_data = pd.read_csv("http://www-bcf.usc.edu/~gareth/ISL/Advertising.csv", index_col=0)
target = np.array(advertising_data.Sales.values)
advertising_data["ones"] = np.ones(200)
advertising_data = advertising_data[["ones", "TV"]]
features = np.array(advertising_data.values)
# Gradient descent implementation here
def error_ols(target, features):
    def h(betas):
        # residuals: y - X.beta
        error = target - np.dot(features, betas)
        return error
    return h

def ols_loss(errors):
    # sum of squared residuals
    return np.sum(errors * errors)

def gradient_descend(initial_guess, learning_step, gradient, iterations=400000):
    for i in range(iterations):
        # gradient() returns X^T (y - X.beta) / n, the negative of the loss
        # gradient, so *adding* it moves downhill on the squared error
        update = initial_guess + learning_step * gradient(initial_guess)
        initial_guess = update
        error = error_ols(target, features)(update)
        print(ols_loss(error))
    return update

def ols_gradient(target, features):
    def h(betas):
        error = target - np.dot(features, betas)
        return np.dot(error, features) / 200
    return h

gradient_function = ols_gradient(target, features)
initial_guess = np.array([1.0, 1.0])
print(gradient_descend(initial_guess, 0.00007, gradient_function))
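For context, here is a minimal sketch of the kind of fix I have been reading about: standardizing the non-intercept feature so that a much larger learning rate stays stable and far fewer iterations are needed. It uses synthetic data in place of the Advertising.csv download (the URL may not always be reachable), so the exact numbers are illustrative assumptions, not the original dataset.

```python
import numpy as np

# Synthetic stand-in for the TV / Sales columns (assumption, not the real data)
rng = np.random.default_rng(0)
n = 200
tv = rng.uniform(0, 300, n)
target = 7.0 + 0.05 * tv + rng.normal(0, 1, n)

# Standardize the feature: mean 0, standard deviation 1
tv_scaled = (tv - tv.mean()) / tv.std()
features = np.column_stack([np.ones(n), tv_scaled])

def ols_gradient(target, features):
    def h(betas):
        error = target - np.dot(features, betas)
        return np.dot(error, features) / len(target)
    return h

def gradient_descend(betas, learning_step, gradient, iterations=1000):
    for _ in range(iterations):
        # same update rule as above, but scaling lets us use step 0.1
        betas = betas + learning_step * gradient(betas)
    return betas

betas = gradient_descend(np.array([0.0, 0.0]), 0.1,
                         ols_gradient(target, features))
# Closed-form least-squares solution for comparison
closed_form, *_ = np.linalg.lstsq(features, target, rcond=None)
print(betas, closed_form)
```

With the standardized feature the two results agree closely after only 1000 iterations instead of 400000, because scaling makes the curvature of the loss similar in every direction, so a single learning rate works well for both coefficients.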