
I am working out the difference between two angles on a circle, and I work out the mean difference across 96 trials in 10 separate samples.

In order to detect outliers for statistical analysis, Barnett & Lewis (Outliers in Statistical Data, 1984) suggest the use of a von Mises basic model (at section 7.1).

  1. Is a von Mises distribution appropriate in my case? I'm not interested in the raw angle values per se, but the difference between them.

  2. I understand how outliers are calculated from z-scores, standard deviations from the mean, etc., but I don't understand how they are calculated from a von Mises model – can anybody offer a simple clarification?

Stephan Kolassa
  • What *is* your case? You say that you are dealing with angles, but there is no information about how these angles are generated, and nothing that could guide the choice of an appropriate distribution. (Your question is a bit like *"I am working out a difference. Is a t-distribution appropriate?"* Knowing the number of samples and knowing that the quantity is an angle does not help to determine what sort of distribution is useful.) – Sextus Empiricus Dec 29 '20 at 23:18
  • The von Mises distribution is some sort of Gaussian distribution equivalent for a circle (maximum entropy, equation for diffusion, CLT approximation for sum of many small variables, that sort of stuff). But just like the Gaussian distribution is not appropriate for every problem, the same might be true for the Von Mises distribution and your problem with angles. – Sextus Empiricus Dec 29 '20 at 23:21

1 Answer


I don't quite understand what you're doing, and I had to look at Wikipedia to remind myself about von Mises, so take my answers with a grain of salt. If you could create an analogous problem with just real numbers instead of angles on a circle, would it make sense to you? von Mises is like a Gaussian on the circle. Are the angle differences all actually small enough to approximate the distribution as a Gaussian?

  1. Angle differences also live within 0 to $2\pi$, so von Mises could be appropriate. But only if it is! Just like points on the real axis may or may not be described by a Gaussian.

  2. You could calculate the "circular variance" and then set the $\kappa$ parameter to match using $\langle \cos \delta\theta \rangle = I_1(\kappa)/I_0(\kappa)$, where $I_0$ and $I_1$ are modified Bessel functions of the first kind. Then flag as outliers the points where the fitted von Mises density is small, recalculate $\kappa$ without them, and repeat. I am assuming zero mean here. Basically, do the analogue of what you'd do with a sample drawn from the real numbers.
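The iteration in point 2 can be sketched as follows. This is only a minimal illustration under stated assumptions, not a recommended test: the function names are made up for this example, the mean direction is taken to be zero as in the answer, and the cutoff is a hypothetical choice. A point is flagged when $\kappa(1 - \cos\delta\theta) > 4.5$, i.e. when its log-density is more than 4.5 below the log-density at the mode, which for large $\kappa$ corresponds roughly to the familiar "3 standard deviations" rule for a Gaussian.

```python
import math


def bessel_ratio(kappa, tol=1e-15, max_terms=2000):
    """Return I_1(kappa) / I_0(kappa) using the power series of both functions."""
    half_sq = (kappa / 2.0) ** 2
    term0, term1 = 1.0, kappa / 2.0  # m = 0 terms of I_0 and I_1
    i0, i1 = term0, term1
    for m in range(1, max_terms):
        term0 *= half_sq / (m * m)
        term1 *= half_sq / (m * (m + 1))
        i0 += term0
        i1 += term1
        if term0 < tol * i0:
            break
    return i1 / i0


def estimate_kappa(diffs, upper=500.0):
    """Solve mean(cos(diff)) = I_1(kappa) / I_0(kappa) by bisection.

    Assumes zero mean direction. The search is capped at `upper`
    (an arbitrary limit that keeps the series within float range).
    """
    r = sum(math.cos(d) for d in diffs) / len(diffs)
    if r <= 0.0:
        return 0.0  # no concentration around zero
    if bessel_ratio(upper) <= r:
        return upper
    lo, hi = 0.0, upper
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if bessel_ratio(mid) < r:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def flag_outliers(diffs, cutoff=4.5, max_iter=20):
    """Alternate between fitting kappa on the inliers and re-flagging all points.

    `cutoff` is a hypothetical threshold on kappa * (1 - cos(diff)).
    Returns the final kappa and a list of booleans (True = flagged).
    """
    keep = [True] * len(diffs)
    kappa = 0.0
    for _ in range(max_iter):
        inliers = [d for d, k in zip(diffs, keep) if k]
        if not inliers:
            break
        kappa = estimate_kappa(inliers)
        new_keep = [kappa * (1.0 - math.cos(d)) <= cutoff for d in diffs]
        if new_keep == keep:
            break
        keep = new_keep
    return kappa, [not k for k in keep]
```

Calling `flag_outliers(diffs)` on a list of angle differences in radians returns the fitted $\kappa$ and one flag per point. Whether a fixed cutoff like this is defensible depends on the data; Barnett & Lewis discuss formal tests, which this sketch does not reproduce.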

mfardal
  • Although *oriented* angle differences are in $[0,2\pi),$ the differences themselves are defined only up to sign and therefore are in $[0,\pi].$ That, I believe, is the crux of the matter. Where you write "then detect outliers" you are begging the question of *how* to detect them. This is what Barnett & Lewis write about, but their account is directed at detecting outliers of the angles rather than their differences. – whuber Mar 20 '20 at 14:17
  • @whuber In my experience the difference is best defined this way. Calculate the clockwise difference (sign, say, $+$) and the anticlockwise (counterclockwise) difference (sign $-$) and use the smaller of the two in absolute value. Thus (using degrees only for convenience) the difference between $45$ and $90$ is the smaller in absolute value of $45$ and $-315$, so $45$. The only indeterminacy here is whether the difference between two opposite directions is assigned $180$ or $-180$. In practice the difference is of interest mostly when two sets of directions should be similar, so that indeterminacy doesn't bite hard. – Nick Cox Aug 29 '20 at 09:58
  • What differences have to do with detecting outliers I do not know. – Nick Cox Aug 29 '20 at 09:59
  • @Nick You describe the (usual) distance along the circle. An important unknown in this thread is whether the OP is computing actual angular differences or just their absolute values. I would presume the former, for otherwise we don't really have a problem of circular statistics, but rather one where the data are constrained within known bounds. – whuber Aug 29 '20 at 13:36
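The signed difference described in the comments above (take whichever of the two ways around the circle is smaller in absolute value) can be written with modular arithmetic. This is a small sketch only; the function names and the sign convention (second angle minus first) are assumptions made for the example, and exactly opposite directions are assigned $-180$ here, which is one of the two arbitrary choices mentioned.

```python
def signed_diff(a, b, period=360.0):
    """Smallest signed difference b - a on a circle, in [-period/2, period/2)."""
    half = period / 2.0
    return (b - a + half) % period - half


def abs_diff(a, b, period=360.0):
    """Unsigned distance along the circle, in [0, period/2]."""
    return abs(signed_diff(a, b, period))
```

With degrees, `signed_diff(45, 90)` gives `45.0`, matching the example in the comment. Passing `period=2 * math.pi` gives the same thing in radians. Using `abs_diff` instead discards the sign, which is the distinction raised in the last comment.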