0

I really need a hint here.

Suppose I want to detect unusual events and express how likely they are to occur. Suppose I also know that two quantities usually move together in a known association, so that a surprising event occurs whenever one deviates strongly from the other.

For example, I know the temperature in my room usually depends on the temperature outside, and I can measure both on a daily basis. If I want to trigger an alarm whenever the air conditioner is on, that is, whenever the deviation is unlikely to be a small random fluctuation, how would I define the parameters and set up the observation formula?

Sycorax
Eugene
  • This goes into causal modelling. In the temperature example, there is a physical process based on physical laws and settings. So we would need a mechanistic view of the world under the given conditions to trigger a detection, and we would express this as a mathematical model with parameters. – msuzen Sep 08 '21 at 00:06

2 Answers

2

Your question is too vague for a definitive answer, so this is really a long comment. Your first paragraph is a very general question: when should we be surprised? Well, we are surprised when what we observe is far from what we expected! For events, we generally expect to observe events of moderate to large probability, so surprise is anti-monotone in probability: the smaller the probability, the larger the surprise. For continuous outcomes the same holds, mutatis mutandis, with probability density in place of probability. For a discussion, see Statistical interpretation of Maximum Entropy Distribution

For statistical models, this translates to the log-likelihood: the lower the log-likelihood of an observation under the model, the greater the surprise (some machine learners seem to have renamed negative log-likelihood to surprise, quite surprising!). See Maximum Likelihood Estimation (MLE) in layman terms
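To make the link between density and surprise concrete, here is a minimal sketch that scores an observation by its negative log-density under a normal model. The normal model, the function name `gaussian_surprisal`, and the example temperatures are assumptions for illustration only; they are not part of the question.

```python
import math

def gaussian_surprisal(x, mean, sd):
    """Negative log-density (in nats) of x under Normal(mean, sd**2).

    Hypothetical helper: larger values mean the observation is more surprising.
    """
    z = (x - mean) / sd
    return 0.5 * z * z + math.log(sd) + 0.5 * math.log(2.0 * math.pi)

# Assumed model: room temperature is Normal(21, 1) on an ordinary day.
typical = gaussian_surprisal(21.0, mean=21.0, sd=1.0)  # observation at the mean
unusual = gaussian_surprisal(25.0, mean=21.0, sd=1.0)  # observation 4 sd away

# The score grows with the squared standardized distance from the mean.
print(typical, unusual)
```

The score has no absolute scale when the outcome is continuous, so it is most useful for comparing observations under the same model.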

But probably the best next step is to search this site, and to edit your post to give more details and context!

kjetil b halvorsen
1

The bivariate normal distribution has a nice conditional distribution of Y given X. See, for example, this educational reference (technical notes).

If this conditional distribution says that Y, given the correlated X, should lie within a few sigma of its conditional expected value, and the observed Y is nevertheless very low or very high, that should be surprising: a possible outlier scenario.
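As a rough sketch of this check, assume (X, Y) is bivariate normal with known means, standard deviations, and correlation. Then Y given X = x is normal with mean mu_y + rho * (sd_y / sd_x) * (x - mu_x) and standard deviation sd_y * sqrt(1 - rho^2). The function name, the parameter values, and the alarm threshold below are made up for illustration; in practice the parameters would be estimated from past measurements.

```python
import math

def conditional_surprise(x, y, mu_x, mu_y, sd_x, sd_y, rho):
    """Return (z, p) for y under the conditional normal of Y given X = x.

    z is the standardized deviation from the conditional mean;
    p is the two-sided tail probability of a deviation at least that large.
    """
    cond_mean = mu_y + rho * (sd_y / sd_x) * (x - mu_x)
    cond_sd = sd_y * math.sqrt(1.0 - rho ** 2)
    z = (y - cond_mean) / cond_sd
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p

# Assumed parameters: outside temperature X, room temperature Y (degrees C).
params = dict(mu_x=15.0, mu_y=22.0, sd_x=8.0, sd_y=2.0, rho=0.8)
ALPHA = 0.001  # hypothetical alarm threshold on the tail probability

# Hot day outside, but the room is much cooler than the model predicts.
z_alarm, p_alarm = conditional_surprise(x=30.0, y=20.0, **params)
# Hot day outside, and the room is about as warm as the model predicts.
z_ok, p_ok = conditional_surprise(x=30.0, y=25.5, **params)

print(z_alarm, p_alarm, p_alarm < ALPHA)
print(z_ok, p_ok, p_ok < ALPHA)
```

With these made-up numbers the first observation falls more than four conditional standard deviations below its expected value and would raise the alarm, while the second would not.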

It may also mean that the implied regression of Y on X has changed, through a change in the correlation between X and Y, a change in the sigma of Y relative to the sigma of X, or a combination of both.

In the real world, it also could imply there are now more than 2 variables at play, where the 3rd variable has largely (but apparently inaccurately) been discounted. This implies a model misspecification issue in regression parlance.

If none of the above events were anticipated or under consideration, then that should indeed be surprising.

AJKOER