
If $X$ follows a zero-mean normal distribution $\mathcal{N}(0, \sigma_N^2)$, then $Y:=\vert X\vert$ is said to follow a half-normal distribution, cf. 2.

I am trying to derive an analytic expression for the distribution of the sample variance of a half-normal distribution as a function of the sample size $n$ and the variance $\sigma_N^2$ of the underlying normal distribution (why? cf. background):

According to 1, sample-variances $S_N^2$ of a normal distribution are distributed as: $$\tag{1}\label{1} \frac{(n-1)S_N^2}{\sigma_N^2} \sim \chi^2(n-1) \qquad \text{for } S_N^2 := \frac{1}{n-1}\sum_{i=1}^n(X_i-\overline{X})^2. $$

In order to model the distribution of sample variances $S_H^2$ of the half normal, I made use of the following relationship between the normal variance $\sigma_N^2$ and the variance of the associated half-normal $\sigma_H^2$, which is given by 2: $$ \sigma_H^2 = \sigma_N^2\left(1-\frac{2}{\pi}\right). $$ Combining this with \eqref{1} made me think that $S^2_H$ could indeed be distributed as $$\tag{2}\label{2} \frac{(n-1)S_H^2}{\sigma_N^2\left(1-\frac{2}{\pi}\right)} \sim \chi^2(n-1) \qquad \text{for } S_H^2 := \frac{1}{n-1}\sum_{i=1}^n(Y_i-\overline{Y})^2. $$
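
As a quick sanity check of the variance relationship alone (it says nothing about the distributional claim in \eqref{2}), here is a minimal sketch. The seed and the number of draws are assumptions picked only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
sigma_N = 2.5

# One large half-normal sample: absolute values of N(0, sigma_N^2) draws
y = np.abs(sigma_N * rng.standard_normal(1_000_000))

var_empirical = y.var(ddof=1)               # sample variance of the half-normal draws
var_formula = sigma_N**2 * (1 - 2 / np.pi)  # sigma_H^2 = sigma_N^2 * (1 - 2/pi)

print(f"empirical: {var_empirical:.4f}, formula: {var_formula:.4f}")
```

The two numbers should agree closely, which only confirms that $S_H^2$ is centred on $\sigma_H^2$, not that its scaled version is $\chi^2$-distributed.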

However, using Python experiments (below), I could verify relationship \eqref{1} for $S^2_N$, but not \eqref{2} for $S^2_H$.

Can anyone explain to me why this is happening and what I am missing?

Code

I wrote the following script to verify the relationship.

import numpy as np
import scipy.stats
import matplotlib.pyplot as plt

sigma_N = 2.5
n = 100
N = 100000

def run_experiments(sample_fctn, scale, title):
    # experimental pdf (N simulations)
    X = sample_fctn()
    vars_exp = np.var(X, ddof=1, axis=1)
    pdf_exp, bins = np.histogram(vars_exp * scale, bins=101, density=True)
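    # predicted pdf, evaluated at the bin centers that the histogram densities refer to
    centers = 0.5 * (bins[:-1] + bins[1:])
    pdf_pred = scipy.stats.chi2.pdf(centers, n-1)
    plt.plot(centers, pdf_pred, label='predicted')
    plt.plot(centers, pdf_exp, label='experimental')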
    plt.legend()
    plt.title(title)
    plt.show()
    
# normal case
normal_scale = (n-1) / sigma_N**2
generate_normal_samples = lambda: sigma_N * np.random.randn(N, n)
run_experiments(generate_normal_samples, normal_scale, "Distribution of Variances (Normal)")

# half-normal case
half_normal_scale = (n-1) / ( sigma_N**2 * (1 - 2/np.pi))
generate_half_normal_samples = lambda: np.abs(sigma_N * np.random.randn(N, n))
run_experiments(generate_half_normal_samples, half_normal_scale, "Distribution of Variances (Half-Normal)")

Experimental Results

  • Ran the experiments with `half_normal_scale = (n - 1) / std_true**2` and results look completely off. – check Jul 29 '21 at 08:00
  • Yes indeed: I misread the question. $S^2_n$ does converge to $\sigma^2_H$ hence to $\sigma^2_N(1-2/\pi)$. But there is no reason for $S^2_n$ to be a scaled $\chi^2$ variable. Actually, in this problem, you could consider $\sum_{i=1}^n X_i^2$ to estimate $\sigma_N^2$ or $\sigma^2_H$ and this variate is scaled $\chi^2$. – Xi'an Jul 29 '21 at 08:12
  • Thanks a lot for the input! As noted in the linked background, my goal is to calculate the probability that a measured *half-normal* variance $S_H^2$ belongs to (another) half-normal distribution $\mathcal{H}$ (which is fully described by the variance of the underlying normal $\sigma_N^2$). To do this, I need to know the distribution of the sample variances drawn from $\mathcal{H}$. Note that the sample-mean of a *half-normal* distribution is necessarily strictly larger than zero. – check Jul 29 '21 at 08:34
  • Just notice that the distribution of $\sum_i X_i^2$ is the same for Normal and Half-Normal and that this is a sufficient statistic in both cases. The fact that the 1/2-Normal mean is larger than zero does not matter. – Xi'an Jul 29 '21 at 12:28
  • Tried it and of course you are right! As you expected, the statistic $\frac{n-1}{\sigma_N^2}\sum_i X_i^2$ is equal for both the half-normal and normal. Unfortunately it does not seem to be described entirely accurately by $\chi^2(n-1)$. Nevertheless thanks for the hint! I will see if I can go further from there. – check Jul 29 '21 at 14:37
  • It is $\chi^2_n$! – Xi'an Jul 29 '21 at 15:05
  • Works like a charm, thank you so much! I have added the resulting test to [the place this question originated](https://stats.stackexchange.com/a/536298/310483). Nevertheless I would be interested in my original question concerning the standard deviation, since this would allow me to test if published data could be half-normally distributed. – check Jul 30 '21 at 12:53
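
The point made in the comments can be checked numerically. The following is a minimal sketch, not the test added to the linked answer: the seed, the number of simulations `N`, and the use of a Kolmogorov-Smirnov test are assumptions made here for illustration. It compares $\sum_i Y_i^2/\sigma_N^2$ against $\chi^2_n$ and $\chi^2_{n-1}$, and looks at the spread of the scaled half-normal sample variance from \eqref{2}.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)       # fixed seed, chosen only for reproducibility
sigma_N, n, N = 2.5, 100, 20_000     # N simulated samples of size n (assumed values)

X = sigma_N * rng.standard_normal((N, n))
Y = np.abs(X)                        # half-normal samples; note that Y**2 == X**2

# Sum of squares scaled by sigma_N^2: the same statistic for normal and half-normal
T = (Y**2).sum(axis=1) / sigma_N**2
ks_n = stats.kstest(T, "chi2", args=(n,))              # compare with chi2(n)
ks_n_minus_1 = stats.kstest(T, "chi2", args=(n - 1,))  # compare with chi2(n-1)
print("KS statistic vs chi2(n):  ", ks_n.statistic)
print("KS statistic vs chi2(n-1):", ks_n_minus_1.statistic)

# Scaled half-normal sample variance, as proposed in (2)
V = (n - 1) * Y.var(axis=1, ddof=1) / (sigma_N**2 * (1 - 2 / np.pi))
print("mean of V:", V.mean(), "(chi2(n-1) mean:", n - 1, ")")
print("var of V: ", V.var(ddof=1), "(chi2(n-1) variance:", 2 * (n - 1), ")")
```

In such a run the sum of squares should sit much closer to $\chi^2_n$ than to $\chi^2_{n-1}$. The scaled sample variance should have roughly the mean of $\chi^2(n-1)$ but a noticeably larger variance, which is consistent with the remark that $S_H^2$ need not be a scaled $\chi^2$ variable.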
