
Suppose we want to model a simple marketing funnel, $F_1 \rightarrow F_2 \rightarrow F_3 \rightarrow F_4$, and we are asked to estimate how many people we expect to reach the end of the funnel, $F_4$.

We are provided:

  • All observed conversion rates $F_{i+1} / F_i$, for $i = 1, 2, 3$
  • How many people enter the funnel at $F_1$

We can assume all quantities above are measured over the same time window $\Delta_t$ (e.g. 1 day).

My question

I understand I can intuitively compute $F_4$ as:

$F_4 = F_1 \left(F_2/F_1\right)\left(F_3/F_2\right)\left(F_4/F_3\right) \tag{1}$

However, I'd like to know which statistical models and assumptions actually yield Eq. 1, if we start by modeling $F_4$ as the expectation of some random variable $x$ and assume that each level of the funnel acts independently.


For example:

  1. Can the above be understood as a "chain of binomial distributions" (one for each level in the funnel)?
  2. What does the distribution of a chain of binomials look like, and how would it yield Eq. 1?
  3. Would such a chain of binomials necessarily yield an $x$ that is Poisson distributed?
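
To make the three questions above concrete, here is a minimal simulation sketch. It rests on an assumption that is not established anywhere in this post: each person at stage $F_i$ moves on independently with a fixed probability $p_i$, so that $F_{i+1} \mid F_i \sim \text{Binomial}(F_i, p_i)$. Under that assumption $E[F_{i+1} \mid F_i] = p_i F_i$, and applying this one stage at a time gives $E[F_4] = F_1 p_1 p_2 p_3$, which is Eq. 1 with the observed rates plugged in for the $p_i$. The function names and the numbers (1000 starters, rates 0.2, 0.3, 0.5) are hypothetical.

```python
import random
from statistics import mean


def simulate_funnel(f1, rates, rng):
    """Pass f1 people through the funnel once; return the count at each stage."""
    counts = [f1]
    for p in rates:
        # Assumption: each person at the current stage converts independently
        # with probability p, so the next count is a Binomial(counts[-1], p) draw.
        survivors = sum(rng.random() < p for _ in range(counts[-1]))
        counts.append(survivors)
    return counts


def expected_final(f1, rates):
    """Eq. 1: the starting count times the product of the conversion rates."""
    result = float(f1)
    for p in rates:
        result *= p
    return result


if __name__ == "__main__":
    rng = random.Random(0)
    f1, rates = 1000, [0.2, 0.3, 0.5]  # hypothetical values
    finals = [simulate_funnel(f1, rates, rng)[-1] for _ in range(1000)]
    print("Eq. 1 estimate:", expected_final(f1, rates))
    print("simulated mean of F4:", mean(finals))
```

Under the same assumption, thinning a binomial count by an independent binomial step gives another binomial, so $F_4 \mid F_1 \sim \text{Binomial}(F_1, p_1 p_2 p_3)$. That is exactly Poisson if $F_1$ is itself Poisson, and approximately Poisson when $F_1$ is large and the product of the rates is small.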
Josh
  • See [here](https://stats.stackexchange.com/questions/92736/what-could-be-a-statistical-test-for-comparing-funnel-data-before-and-after-impo) for 1 and 2. – dimitriy Jul 26 '20 at 23:29
  • You can't make any real progress without having the actual counts, because the standard errors in the rates depend on them. – whuber Jul 27 '20 at 14:11
  • Thanks @whuber. So when people use the formula in Eq. 1 (_"20% make it from F1 to F2, 30% make it from F2 to F3, etc. so let's multiply the rates"_), what assumptions are they making about the underlying statistical process generating the data, and what risks are they incurring? This is a fairly common formula for any type of funnel or chain. I think it's a chain of binomial distributions, where taking expectations gets us Eq. 1, but I am not sure. – Josh Jul 27 '20 at 14:13
  • That's a separate set of questions. My comment can be illustrated by pointing out that one in five customers going from F1 to F2 is information that differs substantially from 20,000 out of 100,000 going from F1 to F2, even though both have the same ratio. – whuber Jul 27 '20 at 14:15
  • @whuber - Totally, although my question is what type of model allows people to use Eq. 1 and still get a meaningful estimate, however poor it is. I assume there is a model that people are implicitly assuming when they use this type of estimator. I suppose it's a chain of binomials, although I can't find any references for it even though these types of back-of-the-envelope rate multiplications are fairly ubiquitous, and as you said, the estimator could have high variance and bias. – Josh Jul 27 '20 at 14:21
  • The key word is "Markov." – whuber Jul 27 '20 at 14:23
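
The point in the comments about counts can be illustrated with a short sketch. It assumes a simple binomial model for a single funnel step, in which the standard error of an observed rate $\hat{p}$ from $n$ trials is $\sqrt{\hat{p}(1-\hat{p})/n}$. The function name is hypothetical; the two count pairs are the ones mentioned in the comment (1 in 5 versus 20,000 out of 100,000).

```python
import math


def rate_standard_error(successes, trials):
    """Standard error of an observed conversion rate, assuming a binomial model."""
    p_hat = successes / trials
    return math.sqrt(p_hat * (1.0 - p_hat) / trials)


# Same 20% rate, very different amounts of information behind it.
for successes, trials in [(1, 5), (20_000, 100_000)]:
    se = rate_standard_error(successes, trials)
    print(f"{successes}/{trials}: rate={successes / trials:.2f}, standard error={se:.4f}")
```

Both pairs give a rate of 0.20, but the standard error is roughly 0.18 in the first case and roughly 0.0013 in the second, so multiplying rates as in Eq. 1 says nothing about how uncertain the resulting estimate is.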
