How can I formulate and determine the overall probability that the consecutive data are overlapping temporally?

Question

Suppose we have data $A, B, C, D, E,$ and $F$ that are expected to arrive at the destination at times $t_A, t_B, t_C,t_D,t_E,$ and $t_F$, respectively. However, the channel through which the data propagate introduces randomness, so the data actually arrive at random times $t'_A, t'_B, t'_C,t'_D,t'_E,$ and $t'_F$, respectively. $T$ is a constant time separation intended to avoid overlap; ideally, each datum should arrive between $t_i$ and $t_i+T$. However, $T$ cannot be too large because that degrades performance (a smaller $T$ is preferred). As an example, $C$ arrives early and overlaps temporally (that is, in time) with $B$, $D$ arrives late and overlaps temporally with $E$, and $F$ arrives early and overlaps temporally with $E$.

I want to know the probability that two consecutive data overlap temporally with one another (illustrated by the darker regions). In other words, the difference between the arrival times of two consecutive data is less than $\tau\ (\tau<T)$, where $\tau$ is the duration of a datum and is the same for all data.

Let the random arrival time of each datum follow a normal distribution, such that $t'_i \sim N(\mu_i,\sigma^2_i)$, where $i \in \{A,B,C,D,E,F\}$ and $\mu_i=t_i$.

Then, from here, that probability is

$$\begin{aligned} P(\text{Two consecutive data are overlapping}) &= P(Z<\tau)=P\left(\frac{Z-\mu_Z}{\sigma_Z}<\frac{\tau-\mu_Z}{\sigma_Z}\right) \\ &=\Phi\left(\frac{\tau-\mu_Z}{\sigma_Z}\right) \end{aligned} \tag{1}$$

where $Z=t_j-t_k, j \neq k$ and $j \in (F,E,D,C,B), k \in(E,D,C,B,A)$.

Eq. (1) allows me to find the probability that two data overlap.
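
As a minimal sketch of how Eq. (1) could be evaluated numerically, the snippet below uses Python's standard-library `statistics.NormalDist`. The function name and the numbers (`tau=1.0`, `mu_z=2.0`, `sigma_z=0.5`) are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist


def pairwise_overlap_probability(tau, mu_z, sigma_z):
    """Return P(Z < tau) for Z ~ N(mu_z, sigma_z**2), as in Eq. (1)."""
    return NormalDist(mu=mu_z, sigma=sigma_z).cdf(tau)


# Hypothetical values: datum duration 1.0, mean spacing 2.0,
# standard deviation of the spacing 0.5 (all in the same time unit).
p = pairwise_overlap_probability(tau=1.0, mu_z=2.0, sigma_z=0.5)
print(f"P(two consecutive data overlap) = {p:.4f}")
```

With these values the standardized argument is $(1.0-2.0)/0.5=-2$, so the result is $\Phi(-2)$.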

How can I formulate and determine the overall probability that the consecutive data are overlapping temporally?

I may be mistaken, but what I gather is that the overall probability is $P(A \text{ and } B \text{ are overlapping})$ and $P(B \text{ and } C \text{ are overlapping})$ and $P(C \text{ and } D \text{ are overlapping})$ and $P(D \text{ and } E \text{ are overlapping})$ and $P(E \text{ and } F \text{ are overlapping})$.

How can I proceed further?

Additionally, since $E$ is overlapping temporally with $D$, we consider them to be destroyed. Then $F$ will not be overlapping temporally with $E$. So, how can we incorporate this condition in the overall probability?

Thank you in advance.

Events are frequently modeled with a Poisson distribution. Is there some reason why you are using a normal distribution of event times instead? — EdM, Aug 31 '20 at 17:04
@EdM Yes, that is correct. However, the events described here pertain to the reception of data in a communication channel. Each data item experiences a random delay before arriving at the destination, which, for simplicity, is taken to be a normally distributed random variable. To keep the post concise, I omitted that description, since the problem at hand remains unchanged without it. Unless I am mistaken! — nashynash, Aug 31 '20 at 17:13
If you use a normal distribution you will necessarily have some negative random delays, as the distribution covers the entire real line. Does that make sense? — EdM, Aug 31 '20 at 17:40
I think your situation is clear enough, but the question is unclear because "the consecutive events are overlapping" can have several interpretations. Could you elaborate on what this means, or give some examples where it holds and doesn't? And is there any essential difference between the problem with 5 overlaps compared to just 3, which would be simpler to ask, illustrate, and answer? — whuber, Aug 31 '20 at 17:55
I'm wondering what I'm missing that you can't just search for the overlap, but need a probability? Is the start and end t not exactly known but estimated? — Josh, Aug 31 '20 at 20:17
@EdM Thank you. I have included additional information to make it clearer. — nashynash, Sep 01 '20 at 02:32
@whuber Thank you. I have included more explanations to clarify the confusion. Concretely, we have three overlaps. And the second and third overlaps are a little tricky as I have explained in the revised texts. — nashynash, Sep 01 '20 at 02:35
Is the following correct for the first part of your question: For a given fixed mean difference of ideal packet inter-arrival times $T$, with each packet having a normal distribution with variance $\sigma^2$ around its ideal arrival time, and a fixed packet width $\tau$, you want to know the probability that any 2 consecutive packets overlap in time? — EdM, Sep 01 '20 at 11:47

score 2 · Accepted Answer · answered Sep 01 '20 at 15:41

The situation is a set of data packets, each of fixed width $\tau$ in time, that ideally start at times $T,2T,3T,...$. The start time of each packet, however, is normally distributed around its ideal starting time, with variance $\sigma^2$.

In the terminology of the question, $Z$ represents the actual difference in start times between 2 consecutive packets.* So by construction, $\mu_Z = T$. If the packet arrival times are independent (except for their defined ideal arrival times), the variance of the difference in arrival times, $\sigma_Z^2$, is $2\sigma^2$. So the probability of 2 consecutive events overlapping can be put a bit more directly as:**

$$\Phi\left(\frac{\tau-T}{\sqrt2\sigma}\right)$$

For concreteness, if you wanted this probability to be 1% or less, you would need approximately $\left(\frac{\tau-T}{\sqrt2\sigma}\right) < -2.326,$ or $T> \tau +3.29 \sigma$.
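
The threshold above can be reproduced with a short sketch, assuming equal variances and independent arrival times as in this answer. The function name `min_spacing` and the example values are hypothetical; `NormalDist().inv_cdf` is the standard normal quantile function from the Python standard library.

```python
from math import sqrt
from statistics import NormalDist


def min_spacing(tau, sigma, target):
    """Smallest T for which Phi((tau - T) / (sqrt(2) * sigma)) <= target."""
    z = NormalDist().inv_cdf(target)  # about -2.326 for target = 0.01
    return tau - z * sqrt(2) * sigma


# Hypothetical values: packet width 1.0, arrival-time standard deviation 0.1.
T_min = min_spacing(tau=1.0, sigma=0.1, target=0.01)
print(f"T must exceed about {T_min:.3f}")  # roughly tau + 3.29 * sigma
```

Plugging `T_min` back into $\Phi\left(\frac{\tau-T}{\sqrt2\sigma}\right)$ should return the target probability, which is a convenient sanity check.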

How can I formulate and determine the overall probability that the consecutive data are overlapping temporally?

If by this you mean the probability that none of the packets overlap versus at least one pair overlapping, the usual interest in a case like this, then you don't want to use the "and" operator with respect to the individual probabilities of overlap, as you do in the question:

what I gather is that the overall probability is $P(A \text{ and } B \text{ are overlapping})$ and $P(B \text{ and } C \text{ are overlapping})$ and $P(C \text{ and } D \text{ are overlapping})$ and $P(D \text{ and } E \text{ are overlapping})$ and $P(E \text{ and } F \text{ are overlapping})$.

That would be close to the probability that all of the packets overlap. (The assumption that the second of an overlapping set of packets is destroyed and thus doesn't overlap with the next packet complicates things a bit.)

If you want to know the probability that all packets were received correctly without overlap, you want to use the "and" operator on the individual probabilities of non-overlap. For each potential overlap, the probability of non-overlap is

$$1- \Phi\left(\frac{\tau-T}{\sqrt2\sigma}\right).$$

Then use the "and" operator on these probabilities of non-overlap. So for non-overlap of 3 packets (2 possible overlaps) you have the square of this probability, for 4 packets the cube, etc. Your example is for 6 packets, with 5 potential overlaps.

Once you have thus determined the probability that no packets overlapped, the probability that some of the packets overlapped (which I think is what this question is getting at) is 1 minus that probability of no overlap.
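
The complement calculation just described might be sketched as follows. The function name is hypothetical, and the code follows this answer in multiplying the per-pair non-overlap probabilities, which treats the overlap events as independent. Adjacent gaps share one arrival time, so this is best read as an approximation that a simulation could check.

```python
from math import sqrt
from statistics import NormalDist


def prob_any_overlap(n_packets, tau, T, sigma):
    """Approximate P(at least one consecutive pair overlaps)."""
    # Per-pair overlap probability, Phi((tau - T) / (sqrt(2) * sigma)).
    p_pair = NormalDist().cdf((tau - T) / (sqrt(2) * sigma))
    # n_packets packets give n_packets - 1 potential overlaps.
    p_none = (1.0 - p_pair) ** (n_packets - 1)
    return 1.0 - p_none


# Hypothetical values: 6 packets (5 potential overlaps), as in the question.
print(prob_any_overlap(n_packets=6, tau=1.0, T=1.33, sigma=0.1))
```

For two packets the result reduces to the single pairwise probability, as expected.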

This type of moving back and forth between probabilities of events and their complements often helps to simplify analysis of problems like this.

*The question currently shows $Z=t_j-t_k$ where the $t_i$ represent the ideal arrival times. Based on the context, I take that to be a typo, with the intent being for $Z$ to represent the actual difference in arrival time, $Z=t_j'-t_k'$.

**One potentially useful trick would be to re-define the time scale in terms of $\sigma$. In particular, if you let one unit of time equal $\sqrt2\sigma$ then this would be just $\Phi\left(\tau-T\right)$. Some find working in such dimensionless units to be simpler.

@nashynash I simply assumed that all packets had the same variance $\sigma^2$ about their ideal arrival times, so that there was no need to distinguish $\sigma_j^2$ from $\sigma_k^2$. I would have used that assumption as a start to get an idea of what was going on, even if I thought that I would later have to deal with the possibility of different variances. If you want to allow for that possibility, use the corresponding $\sqrt{\sigma_j^2+\sigma_k^2}$ in the denominator for each of the consecutive overlap calculations. — EdM, Sep 01 '20 at 17:44
Once again, thank you very much. Truly appreciate your time and efforts for the answer. — nashynash, Sep 01 '20 at 17:59
@nashynash not quite. You don't need the $\sqrt 2$ in the denominator of the argument to $\Phi()$. The denominator if all variances are the same is $\sqrt{\sigma^2 +\sigma^2}=\sqrt{2\sigma^2}=\sqrt2 \sigma$. With different variances, the equivalent is $\sqrt{\sigma_j^2 + \sigma_{j+1}^2}$. Second, although I think your intent is clear there might be some ambiguity in the way you wrote the product term. I would have put further parentheses around all that comes after the product symbol to avoid any ambiguity. — EdM, Sep 02 '20 at 02:41
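
For the unequal-variance case discussed in the comments above, a minimal sketch might look like this. The function name and the example standard deviations are hypothetical; each pair uses $\sqrt{\sigma_j^2+\sigma_{j+1}^2}$ in the denominator, and the product again assumes the pairwise events can be treated as independent.

```python
from math import sqrt
from statistics import NormalDist


def prob_any_overlap_unequal(tau, T, sigmas):
    """Approximate P(at least one overlap) with per-packet standard deviations."""
    phi = NormalDist().cdf
    p_none = 1.0
    # Walk over consecutive pairs (j, j+1) of packets.
    for s_j, s_next in zip(sigmas, sigmas[1:]):
        p_pair = phi((tau - T) / sqrt(s_j**2 + s_next**2))
        p_none *= 1.0 - p_pair
    return 1.0 - p_none


# Hypothetical standard deviations for packets A through F.
sigmas = [0.10, 0.12, 0.08, 0.15, 0.10, 0.09]
print(prob_any_overlap_unequal(tau=1.0, T=1.4, sigmas=sigmas))
```

When all entries of `sigmas` are equal, this reduces to the equal-variance expression with $\sqrt2\sigma$ in the denominator.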
