
Suppose $n = 5$, $s = 0.0771$.

The question is: what is $P(\sigma < 7.63\%)$?

kjetil b halvorsen
user9292
  • $\sigma$ is the population standard deviation, while $s$ is the sample standard deviation. – user9292 Dec 23 '12 at 01:46
  • Suppose we're dealing with a set of data that is normally distributed with $s$ = 0.0771. The population standard deviation is unknown, $\sigma$. Now suppose we want to find $P(\sigma <7.64\%)$. – user9292 Dec 23 '12 at 01:52
  • The distribution of the standard deviation or variance usually depends on the distribution of the original data. – Christopher Aden Dec 23 '12 at 01:53
  • It would probably be better to write $\sigma \lt 0.0763$ or $0.0764$ rather than using percentages. – Henry Dec 23 '12 at 02:07
  • Do we *know* the population mean, or do we *estimate* it? In both cases, what is its value? – Stephan Kolassa Dec 23 '12 at 07:37

1 Answer


Is this question motivated by a concrete problem?

You can’t compute that if you don’t know anything about the distribution of the population. If it is normal, it is known that $s^2$ is drawn from a ${\sigma^2\over n-1}\chi^2(n-1)$ distribution; equivalently, $(n-1)s^2/\sigma^2 \sim \chi^2(n-1)$.
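The distributional claim above can be checked by simulation. This is only a rough sketch: the helper name `scaled_sample_variances`, the seed, the number of replications, and the population mean of 0 are all arbitrary choices made for illustration (the result does not depend on the mean). It draws normal samples of size 5 and compares the first two moments of $(n-1)s^2/\sigma^2$ with those of a $\chi^2(4)$, which has mean 4 and variance 8.

```python
import random
import statistics


def scaled_sample_variances(n, sigma, reps, seed=0):
    """Simulate (n - 1) * s^2 / sigma^2 for normal samples of size n.

    The population mean is set to 0 here (an arbitrary choice): the
    distribution of the sample variance does not depend on it.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(reps):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        s2 = statistics.variance(sample)  # unbiased, divides by n - 1
        draws.append((n - 1) * s2 / sigma**2)
    return draws


draws = scaled_sample_variances(n=5, sigma=0.0763, reps=20000)
mean_draws = statistics.fmean(draws)
var_draws = statistics.pvariance(draws)

# A chi-square with 4 degrees of freedom has mean 4 and variance 8,
# so both numbers should land close to those values.
print(mean_draws, var_draws)
```

Matching two moments is of course not a proof of the distribution, only a quick sanity check.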

Hence, interpreting your question as “what is $\def\P{\mathbb P} \P(S^2 \ge 0.0771^2)$ if $\sigma^2 = 0.0763^2$?”, as is commonly done in the context of confidence intervals, $$\begin{aligned} \P(S^2 \ge 0.0771^2) &= \P\left( 4 {S^2 \over \sigma^2} \ge 4 {0.0771^2 \over 0.0763^2}\right)\\ &= \P( X \ge 4.084 ) \end{aligned}$$ with $X \sim \chi^2(4)$. In R this is `pchisq(4.084, df=4, lower.tail=FALSE)`, which is about 39.5%.
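For readers without R at hand, the same tail probability can be reproduced with the standard library alone. The sketch below relies on the fact that, for 4 degrees of freedom specifically, the $\chi^2$ upper tail has the closed form $e^{-x/2}(1 + x/2)$; the function name `chi2_sf_df4` is made up for this illustration and does not generalise to other degrees of freedom.

```python
import math


def chi2_sf_df4(x):
    """Upper-tail probability P(X > x) for X ~ chi-square(4).

    Closed form valid for 4 degrees of freedom only:
    P(X > x) = exp(-x/2) * (1 + x/2) for x > 0.
    """
    if x <= 0:
        return 1.0
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)


n = 5            # sample size from the question
s = 0.0771       # observed sample standard deviation
sigma = 0.0763   # hypothesised population standard deviation

# Pivot: (n - 1) * S^2 / sigma^2 follows a chi-square(n - 1) distribution.
statistic = (n - 1) * s**2 / sigma**2
p_upper = chi2_sf_df4(statistic)

print(round(statistic, 3), round(p_upper, 3))
```

The printed statistic should agree with the 4.084 used above, and the probability with the value returned by `pchisq`.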

Note on the interpretation of the question

In a previous version of this answer, I made (and flagged) the following abuse of notation: $\P(\sigma^2 < 0.0763^2) = \P\left( \sigma^2 < {0.0763^2 \over 0.0771^2} S^2 \right)$, and then continued as above.

This is totally nonsensical in the frequentist setting, since (as whuber points out in the comments) $\sigma$ is not a random variable but a number. Hence, strictly speaking, $\P(\sigma^2 < 0.0763^2)$ is either 0 or 1, and one doesn’t know which. I think it would have been unfair to ask for a reformulation of the question, as the reformulation above is quite usual and is linked to the pivot method for confidence intervals. I think the question would make sense in Fisher’s fiducial inference; however, I am not sure whether fiducial inference makes sense at all. A rigorous answer could be given in the Bayesian framework, assuming that the parameter $\sigma^2$ is drawn from some prior distribution.

Fiducial inference: distribution on $\sigma^2$

I may not gain friends with what follows... This is one of the cases where Fisher advocated that one can obtain a fiducial probability distribution for the unknown variance $\sigma^2$, with no need for a prior distribution, as follows. $$\begin{aligned} F(x) = \P(\sigma^2 < x) &= \P\left(X > 4 { 0.0771^2\over x} \right)\\ &= 1 - \P\left( X < {0.023778 \over x} \right) \end{aligned}$$ where $X \sim \chi^2(4)$. Of course this is for $x>0$; for $x \le 0$ one sets $F(x) = 0$. Then $F$ is continuous and increases from 0 to 1 on the real line; it is the cdf of the fiducial probability distribution of $\sigma^2$. Here is a graph of $F$ and its derivative $f$ (note the two different scales).

[Figure: fiducial distribution]
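The fiducial cdf $F(x) = 1 - \P(X < 0.023778/x)$ and its derivative can be evaluated directly. The following is a minimal sketch, not the code behind the original graph: the names `fiducial_cdf` and `fiducial_pdf` are invented here, and the closed-form $\chi^2(4)$ tail $e^{-t/2}(1+t/2)$ is specific to 4 degrees of freedom. With $t = 4 \cdot 0.0771^2 / x$, differentiating gives $f(x) = t^2 e^{-t/2} / (4x)$.

```python
import math

S2 = 0.0771**2   # observed sample variance
DF = 4           # degrees of freedom, n - 1 with n = 5


def fiducial_cdf(x):
    """F(x) = P(X > DF * S2 / x) with X ~ chi-square(4); F(x) = 0 for x <= 0."""
    if x <= 0:
        return 0.0
    t = DF * S2 / x
    # Closed-form upper tail of a chi-square(4), valid for 4 df only.
    return math.exp(-t / 2.0) * (1.0 + t / 2.0)


def fiducial_pdf(x):
    """Derivative of fiducial_cdf: f(x) = t^2 * exp(-t/2) / (4 * x), t = DF * S2 / x."""
    if x <= 0:
        return 0.0
    t = DF * S2 / x
    return t * t * math.exp(-t / 2.0) / (4.0 * x)


# Fiducial probability that sigma^2 is below 0.0763^2.
p_below = fiducial_cdf(0.0763**2)
print(round(p_below, 3))
```

Evaluated at $x = 0.0763^2$, this gives the same number as the tail probability computed earlier, which is the sense in which the fiducial argument "answers" the original question.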

I know fiducial inference has flaws and is no longer much in favor. However, for simple questions like this one, which non-statisticians ask very often, it allows one to give satisfactory answers, and it matches well the intuition that people without a mathematical statistics background have about confidence intervals and regions.

Elvis
  • This is distribution dependent: if the underlying data had a uniform distribution then the answer would have been closer to $46\%$ – Henry Dec 23 '12 at 10:52
  • Yes, I think I stated this clearly enough, but anyway it’s good to show that the answer really changes. – Elvis Dec 23 '12 at 11:37
  • I am puzzled by this answer: the assumption explicitly is that $\sigma$ is a parameter of a distribution. That means, among other things, that it is a *number,* not a random variable. Thus the probability statements in this answer appear to be without any meaning at all. Yes, we all know that they can be given a meaning when we assume--explicitly--a *prior* distribution for $\sigma$: but such an assumption is not in evidence here. In technical terms: this answer appears to confuse $\Pr(\sigma | s)$ with $\Pr(s | \sigma)$. – whuber Dec 23 '12 at 15:11
  • OK, I’ll rewrite that to avoid this abuse of notation, which is however common when considering confidence intervals. – Elvis Dec 23 '12 at 15:15
  • @whuber, do you like it better this way?! – Elvis Dec 23 '12 at 15:22
  • It's not as puzzling anymore, but it's hard to see how it answers the question (except indirectly by pointing out that the question is either nonsensical or needs to stipulate a prior for $\sigma$). This "abuse of notation" is addressed in (literally) hundreds of threads here, such as http://stats.stackexchange.com/questions/26450: the bottom line is that it is such a serious abuse that it makes people question anyone who employs it. – whuber Dec 23 '12 at 16:08
  • I think Fisher would have accepted it this way, though... – Elvis Dec 23 '12 at 17:01