
This is an obfuscated version of a real problem:

Each day I speak with some number of (distinct) girls. I compute the Jaccard similarity index between two consecutive days:

$$ J(t)=J(t,t-1)=\frac{|\text{girls}(t)\cap\text{girls}(t-1)|}{|\text{girls}(t)\cup\text{girls}(t-1)|} $$

I would like to compare $J(t+n+1)$ to $J(t), \dots, J(t+n)$ to see whether it is statistically anomalous. For this I need to assume some probability model that describes the distribution of the daily $J(t)$. The support of the distribution should be the bounded interval $[0,1]$. We may further assume that the distribution is unimodal, falling off toward the bounds but staying strictly positive.

What would be a reasonable probability distribution to use? Preferably it should be derived from a reasonable underlying model describing the daily change in the set of distinct girls.
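To make the setup concrete, here is a minimal Python sketch of how the series $J(t)$ could be computed from daily sets, together with a purely empirical comparison of a new value against the history. The function names, the toy data, and the convention that two empty days give $J=0$ are my own assumptions for illustration; the empirical tail fraction is only a placeholder, not the probability model I am asking about.

```python
from typing import Hashable, List, Sequence, Set


def jaccard(a: Set[Hashable], b: Set[Hashable]) -> float:
    """Return |a ∩ b| / |a ∪ b|.

    Assumption: when both sets are empty the index is taken to be 0.0.
    """
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def consecutive_jaccard(days: Sequence[Set[Hashable]]) -> List[float]:
    """Return J(t) = jaccard(days[t], days[t-1]) for t = 1 .. len(days) - 1."""
    return [jaccard(days[t], days[t - 1]) for t in range(1, len(days))]


def lower_tail_fraction(history: Sequence[float], new_value: float) -> float:
    """Fraction of historical values that are <= new_value.

    A small fraction suggests the new value is unusually low relative
    to the history; no distributional model is assumed here.
    """
    if not history:
        raise ValueError("history must not be empty")
    return sum(h <= new_value for h in history) / len(history)


# Hypothetical toy data: one set of identifiers per day.
days = [{"a", "b", "c"}, {"b", "c", "d"}, {"c", "d"}, {"x", "y"}]
series = consecutive_jaccard(days)

# Compare the latest value with all earlier ones.
history, latest = series[:-1], series[-1]
fraction = lower_tail_fraction(history, latest)
```

With only a short history this empirical fraction is very coarse, which is why a parametric model on $[0,1]$ would be preferable.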

kjetil b halvorsen
o17t H1H' S'k

0 Answers