1

When I have a linear regression and want to determine the uncertainty in the slope from the quality of the fit (ignoring any uncertainty from error bars for now), I generally use

$$ \sigma_m = m \sqrt{\frac{1/R^2 - 1}{n-2}} $$

where $R^2$ is the coefficient of determination, $n$ is the number of data points, $m$ is the slope, and $\sigma_m$ is the uncertainty in the slope.

For a set of data that is highly non-linear, and thus has a very low-quality fit, $R^2$ may become negative. However, when $R^2 < 0$, the argument of the square root is negative, so the uncertainty becomes imaginary (and at $R^2 = 0$ it is undefined altogether). Is there a method for determining the uncertainty due to the quality of the fit under these circumstances?
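
For concreteness, here is roughly how I compute it (a minimal sketch in NumPy with made-up numbers, just for illustration):

```python
import numpy as np

# Made-up, roughly linear data just to illustrate the calculation.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.1, 2.1, 3.8, 6.2, 7.9, 10.1])
n = x.size

# Ordinary least-squares fit with an intercept.
m, b = np.polyfit(x, y, 1)

# R^2 = 1 - SS_res / SS_tot
ss_res = np.sum((y - (m * x + b)) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

# Uncertainty in the slope from the quality of the fit.
sigma_m = m * np.sqrt((1.0 / r_squared - 1.0) / (n - 2))

# Cross-check: the textbook standard error of the slope gives the same number
# for an OLS fit with an intercept.
se_m = np.sqrt(ss_res / (n - 2)) / np.sqrt(np.sum((x - x.mean()) ** 2))
print(sigma_m, se_m)
```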

Bunji
Can you provide a reference for the formula you are using? Typically, standard errors, which seem to be what you are trying to calculate, are always positive. – dc3726 Jun 03 '20 at 22:30

2 Answers

2

When there is an intercept, in-sample $R^2 \ge 0$, so there is no risk of an imaginary root. Even $R^2=0$ is so unlikely in real (or even simulated) data that I would consider it to be practically impossible. You can use your equation without fear of an imaginary root or dividing by zero.

If I had an awful fit, however, I would be skeptical of any inference. What does it mean to look at the slope coefficient when the data follow a sine curve?
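
A quick numerical sketch of this point (using NumPy; the sine-wave data are invented for illustration): even for a straight line fit to a sine curve, the in-sample $R^2$ with an intercept stays nonnegative, so the formula in the question still returns a real number.

```python
import numpy as np

rng = np.random.default_rng(0)

# Data that a straight line fits badly: two periods of a sine wave plus noise.
x = np.linspace(0.0, 4.0 * np.pi, 100)
y = np.sin(x) + 0.1 * rng.standard_normal(x.size)
n = x.size

# OLS fit with an intercept.
m, b = np.polyfit(x, y, 1)

# In-sample R^2 is nonnegative whenever an intercept is included.
ss_res = np.sum((y - (m * x + b)) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

# The slope uncertainty from the question's formula is real, even though the fit
# is poor; abs() just reports it as a positive number for the negative slope.
sigma_m = abs(m) * np.sqrt((1.0 / r_squared - 1.0) / (n - 2))
print(f"R^2 = {r_squared:.3f}, slope = {m:.3f} +/- {sigma_m:.3f}")
```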

Dave
0

From Wikipedia, the coefficient of determination is defined as

$$ R^2 = 1 - \frac{SS_{res}}{SS_{tot}} $$

where $SS_{res}$ is the residual sum of squares and $SS_{tot}$ is the total sum of squares.

If you substitute this definition into the factor $1/R^2 - 1$ inside your square root, you get $\frac{SS_{res}}{SS_{tot} - SS_{res}}$; the $n-2$ is just a normalizing factor. The sign therefore depends on whether the residual sum of squares is smaller or larger than the total sum of squares. I think you can take the absolute value of this ratio to make it non-negative without changing what the formula is meant to express.
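
Spelled out, that substitution is

$$ \frac{1}{R^2} - 1 = \frac{1 - R^2}{R^2} = \frac{SS_{res}/SS_{tot}}{1 - SS_{res}/SS_{tot}} = \frac{SS_{res}}{SS_{tot} - SS_{res}}, $$

so the formula in the question becomes

$$ \sigma_m = m \sqrt{\frac{1}{n-2} \cdot \frac{SS_{res}}{SS_{tot} - SS_{res}}}. $$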

Tbone