
I am studying the book "An Introduction to Statistical Learning: with Applications in R", and I am on page 66.

While the book explains how to calculate $\beta_0$ and $\beta_1$, it skips over the actual calculation: it only displays the equations and then the result on the next page. I got lost when $\sigma^2$ is calculated; I don't know how it was computed. Quoting the book:

In general, $\sigma^2$ is not known, but can be estimated from the data. This estimate is known as the residual standard error and is given by the formula $\text{RSE} = \sqrt{\frac{\text{RSS}}{n-2}}$

so I calculated $\sigma^2$ as $\text{RSE} = \sqrt{\frac{\text{RSS}}{n-2}}$, which gives 3.258, but it doesn't add up when I try to use this value in place of $\sigma^2$ in equations (3.8) on the same page.

P.S.: This example uses the Advertising data set, with Sales ($Y$) as a function of TV ($X$) advertising. Available here

Ken D
    Sloppy writing: It should say "In general, **σ** is not known, but can be estimated from the data. This estimate is known as the residual standard error". See also http://stats.stackexchange.com/questions/5135/interpretation-of-rs-lm-output – conjugateprior Sep 08 '14 at 13:11

3 Answers


Looking at ISL's parent book, ESL (Elements of Statistical Learning, Hastie et al., 2009, pp. 44-48), the $N-2$ in the denominator comes from the fact that if there are $p$ variables not including the intercept (so there are $p+1$ variables in toto, usually referred to as $\beta_0, \beta_1,\ldots,\beta_p$), then the unbiased estimate of the variance is: $$ \frac{\sum_{i=1}^N\left(y_i - \hat{y}_i\right)^2}{N - p - 1} $$

In the simple case given by ISL, there is only one non-intercept variable so the numerator remains the RSS and the denominator is $N - 1 - 1 = N - 2$.

Regarding the values, the comment under the question is correct: the writing is a bit misleading. The RSE is an estimate of $\sigma$, not $\sigma^2$; $\text{RSE}^2$ is an estimate of $\sigma^2$. Substitute $\frac{\text{RSS}}{N-2}$ into the equation for $\text{SE}(\hat{\beta}_1)^2$ and you will get the values in ISL.
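To make the substitution concrete, here is a short numeric sketch in Python. The data are made up (I'm not reproducing the Advertising set here), but the mechanics are the same: fit the least-squares line, compute $\text{RSE} = \sqrt{\text{RSS}/(n-2)}$, and plug $\text{RSE}^2$ (not RSE) into $\text{SE}(\hat{\beta}_1)^2 = \sigma^2 / \sum_i (x_i - \bar{x})^2$:

```python
import numpy as np

# Hypothetical data standing in for the Advertising set (TV -> Sales).
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 300, n)
y = 7.0 + 0.05 * x + rng.normal(0, 3.26, n)  # true sigma chosen near ISL's 3.26

# Least-squares estimates of the slope and intercept.
x_bar, y_bar = x.mean(), y.mean()
b1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
b0 = y_bar - b1 * x_bar

# RSE estimates sigma, NOT sigma^2.
rss = np.sum((y - (b0 + b1 * x)) ** 2)
rse = np.sqrt(rss / (n - 2))

# Equation (3.8): SE(b1)^2 = sigma^2 / sum((x_i - x_bar)^2).
# Substitute RSE^2 for sigma^2:
se_b1 = np.sqrt(rse ** 2 / np.sum((x - x_bar) ** 2))
print(rse, se_b1)
```

The key line is the last one: squaring the RSE before dividing by $\sum_i (x_i - \bar{x})^2$ is what makes the numbers come out right.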

Avraham

Can't comment yet (not enough reputation), otherwise this would be a comment.

What is alluded to by "In general, σ2 is not known, but can be estimated from the data. This esti- mate is known as the residual standard error" is the following:

Like any other population parameter (e.g., the true mean), the true variance (or standard deviation) within a population of interest is, generally, not known. So, when drawing a finite sample from a population, the variance has to be estimated.

The simplest estimate would be to calculate the observed variance in the sample, and use this as the best estimate of the true variance within the population. As it turns out, however, it can be shown that this naive approach underestimates the true population variance: the sample variance is a biased estimator.
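A quick simulation (a Python sketch with arbitrary numbers) makes the bias visible: averaged over many small samples, the divide-by-$n$ estimator lands below the true variance, while the divide-by-$(n-1)$ version does not.

```python
import numpy as np

# Draw many small samples from a population with known variance 4
# (normal with sd = 2), then average the two variance estimators.
rng = np.random.default_rng(42)
samples = rng.normal(0, 2.0, size=(100_000, 5))  # 100k samples of size n = 5

naive = samples.var(axis=1, ddof=0).mean()     # divides by n
unbiased = samples.var(axis=1, ddof=1).mean()  # divides by n - 1
print(naive, unbiased)  # naive sits near 4 * (n-1)/n = 3.2; unbiased near 4
```

The naive estimator is off by exactly the factor $(n-1)/n$ on average, which is why the correction matters most for small samples.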

Wikipedia, as always, has more on this: http://en.wikipedia.org/wiki/Variance#Population_variance_and_sample_variance

I suspect that you are conflating the calculation of the unbiased sample variance with the calculation of the residual sum of squares. However, I don't have access to the book. Perhaps you will get a more insightful answer if you quote the relevant part of p. 66 that sets up the problem.

coanil

This must be a printing error or a simple mistake: by definition, $\frac{\text{RSS}}{n-2}$ (without the square root sign) is an unbiased estimate of $\sigma^2$.

[Strictly speaking, $\sqrt{\text{RSS}/(n-2)}$ is NOT an unbiased estimate of $\sigma$; we just use it anyway (getting an unbiased estimator of $\sigma$ is tricky: see http://en.wikipedia.org/wiki/Unbiased_estimation_of_standard_deviation)!]
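Both claims can be checked with a small simulation (a Python sketch with arbitrary true parameters): over repeated regressions, $\text{RSS}/(n-2)$ averages out to $\sigma^2$, while $\sqrt{\text{RSS}/(n-2)}$ averages out to something noticeably below $\sigma$ when $n$ is small.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 2.0
n = 5                       # small n makes the bias in the RSE visible
x = np.linspace(0, 1, n)
reps = 50_000

var_hats, rses = [], []
for _ in range(reps):
    # Simulate y = 1 + 3x + noise and fit by least squares.
    y = 1.0 + 3.0 * x + rng.normal(0, sigma, n)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()
    rss = np.sum((y - b0 - b1 * x) ** 2)
    var_hats.append(rss / (n - 2))          # estimator of sigma^2
    rses.append(np.sqrt(rss / (n - 2)))     # estimator of sigma

print(np.mean(var_hats))  # close to sigma^2 = 4: unbiased
print(np.mean(rses))      # noticeably below sigma = 2: biased downward
```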

econ_guy