Generating a confidence interval for the difference in standard deviation between two populations

Question

Background

I've made a device which sizes potatoes, and one component in that device comes with a known error with respect to mean and standard deviation. I want to know whether my device - which combines that component with several other components - introduces any new errors, or whether its errors are within the bounds of the known errors.

Definitions

I have two independent random variables, $X \sim \mathcal{N}(\mu_x, \sigma^2_x)$ and $Y \sim \mathcal{N}(\mu_y, \sigma^2_y)$.
I take unpaired samples from $X$ and $Y$ of size $n_x$ and $n_y$ respectively.
Let $\bar{x}$ be the sample mean and $s_x$ be the sample standard deviation for a given sample from $X$, and likewise for $\bar{y}$ and $s_y$.
Let $\delta_\mu = |\mu_x - \mu_y|$ and $\delta_\sigma = |\sigma_x - \sigma_y|$.

Deriving a Confidence Interval for the Difference in Means

Using a formula from Welch's t-test, we can derive a confidence interval for $\delta_\mu$:

$$ \text{CI} = |\bar{x} - \bar{y}| \pm t\sqrt{\frac{s_x^2}{n_x}+\frac{s_y^2}{n_y}} $$

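where $t$ is the critical $t$-value for the chosen confidence level, with degrees of freedom determined by the sample sizes (for Welch's test, via the Welch-Satterthwaite approximation).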
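As a rough sketch of the interval above, the following computes it from raw samples. The function name `welch_mean_diff_ci` is made up for illustration, and the critical value is taken from the standard normal rather than Student's $t$ (an assumption that is only reasonable for large samples); the Welch-Satterthwaite degrees of freedom are returned so that a proper $t$ quantile can be looked up instead.

```python
import math
from statistics import NormalDist, mean, stdev


def welch_mean_diff_ci(x, y, level=0.95):
    """Approximate CI for |mu_x - mu_y| from two unpaired samples.

    Hypothetical helper. Uses a normal critical value in place of the
    t quantile, which is an assumption suited to large samples only.
    Returns (lower, upper, welch_satterthwaite_df).
    """
    n_x, n_y = len(x), len(y)
    # Per-sample contributions s^2 / n to the variance of the difference.
    v_x = stdev(x) ** 2 / n_x
    v_y = stdev(y) ** 2 / n_y
    std_err = math.sqrt(v_x + v_y)
    # Welch-Satterthwaite degrees of freedom, for looking up a t quantile.
    df = (v_x + v_y) ** 2 / (v_x ** 2 / (n_x - 1) + v_y ** 2 / (n_y - 1))
    # Two-sided critical value from the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    centre = abs(mean(x) - mean(y))
    return centre - z * std_err, centre + z * std_err, df
```

With small samples, the normal critical value should be replaced by the $t$ quantile at the returned degrees of freedom.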

Deriving a Confidence Interval for the Difference in Variances

I know that, if $X \sim \mathcal{N}(\bar{x}, s^2)$, then, taking samples of size $n$, $\text{var}(\bar{x}) = \frac{s^2}{n}$ and, for large $n$, $\text{var}(s^2) \approx \frac{2s^4}{n}$. Making the appropriate substitutions to our previous CI, we can derive a confidence interval for $\delta_{\sigma^2}$, the difference in population variances:

$$ \text{CI} = |s_x^2 - s_y^2| \pm t\sqrt{\frac{2s_x^4}{n_x} + \frac{2s_y^4}{n_y}} $$
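The $\frac{2s^4}{n}$ term under the square root is the large-sample variance of the sample variance for normal data. A quick Monte Carlo check, sketched here with arbitrary illustrative values for $\sigma$, $n$ and the number of replications (none of them come from the potato data), compares the simulated spread of $s^2$ with $\frac{2\sigma^4}{n}$:

```python
import random


def sample_variance(values):
    """Unbiased sample variance (divisor n - 1)."""
    n = len(values)
    m = sum(values) / n
    return sum((v - m) ** 2 for v in values) / (n - 1)


def simulated_var_of_s2(sigma=2.0, n=30, reps=5000, seed=0):
    """Variance of s^2 across many simulated normal samples of size n.

    Hypothetical helper; sigma, n, reps and seed are illustrative choices.
    """
    rng = random.Random(seed)
    s2_values = []
    for _ in range(reps):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        s2_values.append(sample_variance(sample))
    return sample_variance(s2_values)


sigma, n = 2.0, 30
print("simulated var(s^2):", simulated_var_of_s2(sigma, n))
print("2 * sigma^4 / n   :", 2 * sigma ** 4 / n)
```

The two printed numbers should agree to within simulation noise and the small-$n$ correction (the exact value for normal data is $\frac{2\sigma^4}{n-1}$).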

Deriving a Confidence Interval for the Difference in Standard Deviations

At this point it's tempting to conclude that the confidence interval for the difference in standard deviations is simply the square root of the confidence interval for the difference in variances. But that isn't true; $\delta_\sigma \neq \sqrt{\delta_{\sigma^2}}$. To illustrate this point with an example, suppose we took large samples from $X$ and $Y$, and found that $s_x = 10$ and $s_y = 5$; then $s_x - s_y = 5$, whereas $\sqrt{s_x^2 - s_y^2} = \sqrt{100 - 25} \approx 8.66$.

I don't know where to go from here. I'm fairly sure that the confidence interval that I'm looking for will start off $\text{CI} = |s_x - s_y|$... But, to fill in the rest, I would need a formula for the variance of the standard deviation, which I'm not able to derive, nor can I find.

EDIT: A potential solution?

According to another answer on this site, given a sufficiently large $n$ as well as some other assumptions, the sample standard deviation will approximately follow the distribution $S \sim \mathcal{N}(\sigma, \frac{\sigma^2}{2n})$. If that's true then the confidence interval would look like:

$$ \text{CI} = |s_x - s_y| \pm t\sqrt{\frac{s_x^2}{2n_x} + \frac{s_y^2}{2n_y}} $$

Is that true? It looks right, and, in the particular case I'm trying to solve, the numbers seem correct, but I'd be grateful if someone could say, 'Yeah, that looks okay,' or, 'No, that's nonsense.'
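For what it's worth, the variance $\frac{\sigma^2}{2n}$ is what a first-order delta-method argument gives: if $\text{var}(s^2) \approx \frac{2\sigma^4}{n}$, then $\text{var}(s) \approx \frac{\text{var}(s^2)}{(2\sigma)^2} = \frac{\sigma^2}{2n}$. A minimal sketch of the resulting interval, working from summary statistics, is below. The function name `sd_diff_ci` is invented for illustration, and a normal critical value stands in for $t$, which assumes both samples are large.

```python
import math
from statistics import NormalDist


def sd_diff_ci(s_x, n_x, s_y, n_y, level=0.95):
    """Large-sample CI for |sigma_x - sigma_y| from summary statistics.

    Hypothetical helper. Assumes approximately normal data and large n,
    so that each sample SD is roughly N(sigma, sigma^2 / (2n)).
    """
    # Standard error of the difference between two independent sample SDs.
    std_err = math.sqrt(s_x ** 2 / (2 * n_x) + s_y ** 2 / (2 * n_y))
    # Two-sided critical value from the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    centre = abs(s_x - s_y)
    return centre - z * std_err, centre + z * std_err


# Illustrative inputs only: the SDs from the example above, n = 5000 each.
print(sd_diff_ci(10.0, 5000, 5.0, 5000))
```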

1) Why difference of st.dev.? Ratio is often more natural ... 2) look into profile likelihood, see tag [tag:profile-likelihood] — kjetil b halvorsen, Nov 17 '20 at 10:57
@kjetilbhalvorsen I've made a device which sizes potatoes, and one component in that device comes with a known error with respect to mean and standard deviation. I want to know whether my device - which combines that component with several other components - introduces any new errors, or whether its errors are within the bounds of the known errors. — Tom Hosker, Nov 17 '20 at 15:26
Can you please add that information as an edit to the post? Not everybody reads comments! I will try to compose an answer ... — kjetil b halvorsen, Nov 17 '20 at 15:31
@kjetilbhalvorsen I'll do that now. Also, please see the edit I've just finished, in which I outline a potential solution. — Tom Hosker, Nov 17 '20 at 15:40
Your solution could work. However, except for very large values of both $n_x$ and $n_y,$ you shouldn't use a Student t distribution to determine $t.$ You likely need a longer tailed distribution. The theory isn't really worth working out because the result is extremely sensitive to the assumed normality of $X$ and $Y.$ Bootstrapping might provide some guidance here, provided neither of $n_x$ and $n_y$ are too small. — whuber, Nov 17 '20 at 16:33
@whuber Thank you. In my particular case, $n_x$ and $n_y$ are both over 5000, and always will be. (The point of the device is to size a lot of potatoes very quickly!) Would that sample size be enough to soften some of your reservations? — Tom Hosker, Nov 17 '20 at 16:47
Definitely--especially because the sizes of the potatoes have plausible upper bounds. — whuber, Nov 17 '20 at 16:48
I do not really understand bullet point 3 in **Definitions**. — kjetil b halvorsen, Nov 17 '20 at 16:52
@kjetilbhalvorsen I've just re-phrased it. Tell me if it's still confusing. — Tom Hosker, Nov 17 '20 at 17:11
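The bootstrapping idea raised in the comments could be sketched roughly as follows. This is a plain percentile bootstrap, which is one of several possible variants and not necessarily the one the commenter had in mind; the function names, the number of resamples and the seed are all illustrative assumptions.

```python
import math
import random


def sample_sd(values):
    """Sample standard deviation (divisor n - 1)."""
    n = len(values)
    m = sum(values) / n
    return math.sqrt(sum((v - m) ** 2 for v in values) / (n - 1))


def bootstrap_sd_diff_ci(x, y, level=0.95, reps=1000, seed=0):
    """Percentile-bootstrap CI for |sigma_x - sigma_y|.

    Hypothetical helper: resamples x and y independently with
    replacement and takes empirical quantiles of |s_x* - s_y*|.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        boot_x = rng.choices(x, k=len(x))
        boot_y = rng.choices(y, k=len(y))
        diffs.append(abs(sample_sd(boot_x) - sample_sd(boot_y)))
    diffs.sort()
    alpha = 1 - level
    # Conservative index choice: round outwards at both ends.
    lower = diffs[math.floor(alpha / 2 * (reps - 1))]
    upper = diffs[math.ceil((1 - alpha / 2) * (reps - 1))]
    return lower, upper
```

Comparing this interval with the normal-approximation one gives a rough indication of how much the result leans on the normality assumption.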