
So I've been watching Andrew Ng's machine learning lectures, and I'm on a video about univariate linear regression. He was talking about how a hypothesis takes an input and predicts an output, like a typical function we learn in math class such as $f(x) = mx + b$, where $f$ is a function of input $x$ whose graph is a line with slope $m$ and y-intercept $b$. However, Ng said that the general hypothesis equation in linear regression is $h_\theta(x) = \theta_0 + \theta_1 x$. I get that this is a function $h$ of input $x$, and it looks like $\theta_1 x$ is equivalent to $mx$ while $\theta_0$ is equivalent to $b$, but why use all the thetas instead of the other variables? Is $\theta_1$ a slope like $m$? Why use theta multiple times? What is the meaning of the subscripts?

Thanks!

Jodast

1 Answer


$\theta$ is a common variable in statistics. We usually see $\theta$ as an angle in trigonometry and physics long before we encounter it in statistics, but there it is simply the conventional symbol for an unknown parameter.

$$\theta_0 = b$$

$$\theta_1 = m$$

The interpretations of the intercept and slope parameters are different, hence the different subscripts.
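To make the correspondence concrete, here is a minimal Python sketch (my own illustration, not from the lecture): with $\theta_0 = b$ and $\theta_1 = m$, the two notations describe exactly the same function.

```python
def f(x, m, b):
    """Slope-intercept form from math class: f(x) = m*x + b."""
    return m * x + b

def h(x, theta0, theta1):
    """Ng's hypothesis: h_theta(x) = theta_0 + theta_1 * x."""
    return theta0 + theta1 * x

# With theta_0 = b and theta_1 = m, the outputs are identical.
m, b = 2.0, 1.0
for x in [0.0, 1.0, 2.5]:
    assert f(x, m, b) == h(x, theta0=b, theta1=m)
print("f and h agree at every tested x")
```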

The reason there is a $\theta$ subscript on $h_{\theta}$ is that $\theta$ without a subscript stands for the whole parameter pair $(\theta_0,\theta_1)$, which can be any point in $\mathbb{R}^2$. (Did he, by any chance, use a capital theta, $\Theta$?) What this means is that the equation is a valid regression equation for any values of $\theta_0$ and $\theta_1$; the subscript reminds you that $h$ depends on which parameters you pick. This is for technical reasons when it comes to hypothesis testing.
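As a quick sketch of that last point (my own example, not something from the course): treating $\theta$ as a single vector $(\theta_0, \theta_1)$, each choice of $\theta$ picks out one member of a whole family of lines, and $h_\theta$ names the member you chose.

```python
import numpy as np

def h(theta, x):
    """h_theta(x) = theta_0 + theta_1 * x, with theta a length-2 vector."""
    return theta @ np.array([1.0, x])  # dot product of theta with [1, x]

# Every choice of theta in R^2 gives a valid hypothesis (a different line):
for theta in [np.array([0.0, 1.0]), np.array([1.0, 2.0]), np.array([-3.0, 0.5])]:
    print(f"theta = {theta}: h_theta(2.0) = {h(theta, 2.0)}")
```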

Dave