
A measuring rod has length $u$ (for "unit") and a long object has length $x$. Suppose $u$ is laid end to end $i$ times and $x$ is laid end to end $j$ times. We want to observe whether $iu\ \left\{\begin{array}{c} < \\ = \\ > \end{array}\right\}\ jx$. But for each iteration of $u$ and of $x$ there is a random error---say we observe whether $$iu + \varepsilon_1+\cdots+\varepsilon_i\ \left\{\begin{array}{c} < \\ > \end{array}\right\}\ jx+\delta_1+\cdots+\delta_j$$ for $i=1,\ldots,I$ and $j=1,\ldots,J$, where the $\varepsilon$s are independent and $\sim N(0,\sigma^2)$ and the $\delta$s are independent of each other and of the $\varepsilon$s and $\sim N(0,\tau^2)$. So we have $IJ$ observations, each binary, equal to either "$<$" or "$>$" (encode them with $0$s and $1$s if you like).

What is known about statistical inference about the ratio $x/u$ in this problem? Things like the MLE for $x/u$ or the MLEs for $\sigma$ and $\tau$, or confidence intervals for $x/u$, etc. For large values of $I$ and $J$, might one be able to look for things like non-normality of the distributions of $\delta$ and $\varepsilon$?

Michael Hardy
  • If you want to make statements about the ratio $x/u$ it would of course be much better to have the measurements $iu+\epsilon_1,iu+\epsilon_2,\ldots,jx+\delta_j$ rather than just binary variables (as the latter only tell you that one is bigger than the other and not _how much bigger_ it is). With some of the parameters known, binary data might be sufficient though. Am I correct to think that $u$ is known? What about $\sigma^2$ and $\tau^2$? – MånsT Jul 11 '12 at 07:11
  • @MånsT : We can't take $u$ to be known because $u$ is the unit of measurement by means of which all lengths are to be known. I'm not sure it would make any difference if $u$ were in some sense known in terms of other units. – Michael Hardy Jul 11 '12 at 19:45
  • +1 This is a probit model with parameters $(u,x)$ and *dependent* observations determined by a multivariate normal distribution (whose covariance matrix depends on the two nuisance parameters $\sigma$ and $\tau$ but is otherwise completely determined). The integrals involved in the likelihood look formidable, suggesting a Bayesian MCMC approach would be the most practical line of attack. – whuber Jul 11 '12 at 21:21
  • I think for certain values of the data there's a non-unique MLE. For example, suppose $I=J=2$ and it is observed that: $\begin{aligned} x+\text{error} & > u+\text{error} \\ 2x+\text{error} & > u+\text{error} \\ 2x+\text{error} & > 2u + \text{error} \\ x+\text{error} & < 2u+\text{error}\end{aligned}$. Then, if I'm not mistaken, the maximum likelihood would be obtained at every point where $\hat\sigma=\hat\tau=0$ and $1 < \widehat{x/u} < 2$. But if we get inequalities that would be logically inconsistent if not for the error terms, then I suspect the MLE would be unique. – Michael Hardy Jul 13 '12 at 19:38

1 Answer


Assume the only thing known for each combination $(i,j)$ is whether the observation is a 1 (the noisy $iu$ is bigger) or a 0 (otherwise).

Suppose you put the observations in an $I \times J$ matrix, with rows indexed by $i$ and columns by $j$ (the top-left element is $(1,1)$ and the bottom-left is $(I,1)$).

Here is the idea that I have. Unfortunately I cannot prove it mathematically, but it might give you some direction.

If you take the biggest possible square submatrix from the top left and calculate the average, it may be a decent estimator of x/u.
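
To experiment with this idea, here is a minimal simulation sketch in Python with NumPy. It rests on several assumptions that are not stated in the answer: the same error sequences $\varepsilon_1,\ldots,\varepsilon_I$ and $\delta_1,\ldots,\delta_J$ are reused for every pair $(i,j)$, as the formula in the question suggests, and the names `simulate_comparisons`, `top_left_square_mean` and `ratio_from_mean` are hypothetical helpers made up for illustration. One caveat worth checking by hand: without noise and for a large $n \times n$ square, the fraction of 1s tends to $1/(2r)$ when $r = x/u \ge 1$ and to $1 - r/2$ when $r < 1$, so the average looks like a monotone function of $x/u$ rather than $x/u$ itself. The last helper simply inverts that noise-free limit.

```python
import numpy as np


def simulate_comparisons(u, x, sigma, tau, I, J, rng):
    """Simulate the I-by-J matrix of binary comparisons.

    Entry (i, j) is 1 when i*u plus its accumulated error exceeds
    j*x plus its accumulated error, and 0 otherwise.
    Assumption: the same error sequences are reused for every pair (i, j).
    """
    i_idx = np.arange(1, I + 1)
    j_idx = np.arange(1, J + 1)
    lhs = i_idx * u + np.cumsum(rng.normal(0.0, sigma, size=I))
    rhs = j_idx * x + np.cumsum(rng.normal(0.0, tau, size=J))
    return (lhs[:, None] > rhs[None, :]).astype(int)


def top_left_square_mean(obs):
    """Average of the largest square submatrix anchored at the top-left."""
    n = min(obs.shape)
    return float(obs[:n, :n].mean())


def ratio_from_mean(m):
    """Hypothetical inversion of the noise-free, large-n limit of the mean."""
    if m <= 0.0:
        return float("inf")  # no 1s at all: the ratio is not bounded above
    return 1.0 / (2.0 * m) if m <= 0.5 else 2.0 * (1.0 - m)


rng = np.random.default_rng(0)
obs = simulate_comparisons(u=1.0, x=2.0, sigma=0.05, tau=0.05, I=200, J=200, rng=rng)
m = top_left_square_mean(obs)
print(m, ratio_from_mean(m))
```

Rerunning with different seeds and noise levels gives a rough feel for how much the statistic moves around; this is only an exploration aid, not a justified estimator.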

Furthermore, the variance will be higher if the 'changeover' from 0 to 1 does not show a nice pattern and/or if you often see that there is no clean changeover (for example, 000110111). However, I do not even dare to suggest an estimator for the variance.