
I observe the processing times of a process before and after a change, in order to find out whether the change has improved the process. The process has improved if the processing time is reduced. The distribution of the processing times is fat-tailed, so comparing averages is not sensible. Instead, I'd like to know whether the probability of observing a lower processing time after the change is significantly above 50%.

Let $X$ be the random variable for the processing time after the change and $Y$ the one before. If $P(X < Y)$ is significantly above $0.5$ then I'd say the process has improved.

Now I have $n$ observations $x_i$ of $X$ and $m$ observations $y_j$ of $Y$. The empirical estimate of $P(X < Y)$ is $\hat p = \frac{1}{n m} \sum_i \sum_j 1_{x_i < y_j}$.
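
For concreteness, $\hat p$ can be computed in a couple of lines of R; the sketch below uses hypothetical samples in place of the real observations.

# Sketch: empirical estimate of P(X < Y); x and y are hypothetical samples
# standing in for the observed processing times.
set.seed(1)
x = rexp(50)  # n observations after the change (hypothetical)
y = rexp(60)  # m observations before the change (hypothetical)

phat = mean(outer(x, y, "<"))  # fraction of pairs with x_i < y_j
phat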

What can I say about $P(X < Y)$ given the observations $x_i$ and $y_j$?

Christian

3 Answers


@jbowman provides a (nice) standard solution to the problem of estimating $\theta=P(X<Y)$, which is known as the stress-strength model.

Another nonparametric alternative was proposed in Baklizi and Eidous (2006) for the case where $X$ and $Y$ are independent. This is described below.

By definition we have that

$$\theta=P(X<Y)=\int_{-\infty}^{\infty}F_X(y)f_Y(y)dy,$$

where $F_X$ is the CDF of $X$ and $f_Y$ is the density of $Y$. Then, using the samples of $X$ and $Y$, we can obtain kernel estimators of $F_X$ and $f_Y$ and consequently an estimator of $\theta$,

$$\hat\theta=\int_{-\infty}^{\infty}\hat F_X(y)\hat f_Y(y)dy.$$

This is implemented in the following R code using a Gaussian kernel.

# Rule-of-thumb (Silverman) bandwidth for a Gaussian kernel
h = function(x){
  n = length(x)
  return((4*sqrt(var(x))^5/(3*n))^(1/5))
}

# Kernel estimators of the density (kg) and of the CDF (KG)
kg = function(x, data){
  hb = h(data)
  r = numeric(length(x))
  for(i in seq_along(x)) r[i] = mean(dnorm((x[i] - data)/hb))/hb
  return(r)
}

KG = function(x, data){
  hb = h(data)
  r = numeric(length(x))
  for(i in seq_along(x)) r[i] = mean(pnorm((x[i] - data)/hb))
  return(r)
}

# Baklizi and Eidous (2006) estimator of theta = P(X < Y)
nonpest = function(dat1B, dat2B){
  return(as.numeric(integrate(function(x) KG(x, dat1B)*kg(x, dat2B), -Inf, Inf)$value))
}

# Example when X and Y are standard Cauchy (true theta = 0.5)
datx = rcauchy(100, 0, 1)
daty = rcauchy(100, 0, 1)

nonpest(datx, daty)

In order to obtain a confidence interval for $\theta$ you can get a bootstrap sample of this estimator as follows.

# Bootstrap sample of the estimator
B = 1000
p = rep(0, B)

for(j in 1:B){
  dat1 = sample(datx, length(datx), replace = TRUE)
  dat2 = sample(daty, length(daty), replace = TRUE)
  p[j] = nonpest(dat1, dat2)
}

# Histogram of the bootstrap sample
hist(p)

# A 95% confidence interval (percentile type)
c(quantile(p, 0.025), quantile(p, 0.975))

Other sorts of bootstrap intervals might be considered as well.
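
For instance, percentile and BCa intervals can be computed with the boot package. The sketch below is one possible way to set up the two-sample (stratified) resampling; it assumes datx, daty, and nonpest() from above are available and is not part of the original answer.

# Sketch: percentile and BCa bootstrap intervals via the boot package
library(boot)

comb = c(datx, daty)
grp  = rep(1:2, c(length(datx), length(daty)))

# With strata = grp, boot() resamples the x's and y's separately
stat = function(d, i) nonpest(d[i][grp == 1], d[i][grp == 2])

out = boot(comb, stat, R = 1000, strata = grp)
boot.ci(out, type = c("perc", "bca"))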


Your estimate $\hat{p}$ is equal to the Mann-Whitney $U$ statistic divided by $mn$ (thanks, Glen!), and is therefore equivalent to the Wilcoxon rank-sum statistic $W$ (also known as the Wilcoxon-Mann-Whitney statistic): $W = U + \frac{n(n+1)}{2}$, where $n$ is the sample size of $y$ (assuming no ties). You can therefore use tables/software for the Wilcoxon test and transform the results back to $U$ to get a confidence interval or a $p$-value.

Let $m$ be the sample size of $x$ and $N = m+n$. Then, asymptotically,

$$W^* = \frac{W-\frac{n(N+1)}{2}}{\sqrt{\frac{mn(N+1)}{12}}} \sim \text{N}(0,1).$$

Source: Hollander and Wolfe, Nonparametric Statistical Methods, roughly p. 117, but probably most nonparametric statistics books will get you there.
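
In R, for instance, the statistic reported by wilcox.test is the Mann-Whitney $U$ counting pairs in which the first sample exceeds the second, so $\hat p$ and an asymptotic $p$-value follow directly. A minimal sketch (my own, with hypothetical samples x and y):

# Sketch: hat-p and a p-value via wilcox.test; its statistic counts the
# pairs where the first argument exceeds the second.
set.seed(1)
x = rexp(50)  # sample of X (after the change), hypothetical
y = rexp(60)  # sample of Y (before the change), hypothetical

wt = wilcox.test(y, x)  # statistic counts pairs with y_j > x_i
phat = as.numeric(wt$statistic) / (length(x) * length(y))
phat         # estimate of P(X < Y)
wt$p.value   # two-sided Wilcoxon-Mann-Whitney p-value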

jbowman
  • @Glen_b - thanks, I've updated the answer. Very generous guess you made there about the cause of the mistake! – jbowman Oct 17 '19 at 15:29

Consider the paired differences $X_i-Y_i$. With $p = P(X_i-Y_i<0)$, the indicators $I\{X_i-Y_i<0\}$ for $i=1,2,\ldots,n$ are iid Bernoulli random variables. So the number $T$ of pairs with $X_i < Y_i$ is binomial with parameters $n$ and $p=P(X_i-Y_i<0)$. Then $T/n$ is an unbiased estimate of the probability, and confidence intervals and hypothesis tests can be based on the binomial distribution.
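
A minimal sketch of this paired approach, assuming a meaningful pairing and equal sample sizes (the data below are hypothetical):

# Sketch of the paired binomial approach; assumes the x's and y's are
# genuinely paired and of equal length n.
set.seed(1)
n = 50
x = rexp(n)  # processing times after the change (hypothetical)
y = rexp(n)  # paired processing times before the change (hypothetical)

cnt = sum(x < y)             # number of pairs with x_i < y_i
cnt / n                      # unbiased estimate of p
binom.test(cnt, n, p = 0.5)  # exact binomial test and confidence interval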

Michael R. Chernick
  • What is the basis of the pairing, Michael? – whuber Jun 09 '12 at 16:45
  • The OP said "Let X be the random variable for the processing time after the change and Y the one before" So Xi is after the intervention and Yi is before. – Michael R. Chernick Jun 10 '12 at 04:35
  • Did you notice that the counts (potentially) differ? You appear to assume $m=n$. My reading is that a "process" is *temporal* and that the $Y_j$ sample it before an event and the $X_i$ sample it after an event. – whuber Jun 10 '12 at 13:34
  • You're right. I guess some sort of two-sample test, such as the Wilcoxon suggested by jbowman above, would be appropriate. It is interesting that the Mann-Whitney form of the test counts the number of $X_i$'s $<$ the $Y_j$'s. – Michael R. Chernick Jun 10 '12 at 14:13