I have a problem where I want to measure deviations from zero degrees. This outcome variable is a circular measure, since a deviation of -180 degrees is equivalent to 180 degrees.

However, I don't want to complicate my model (using a linear mixed effects model) by using circular statistics, so I was wondering if I can use the absolute deviation expressed as a percentage of 180?

So for instance, -180 and 180 degrees would both give 100% deviation, while 90 and -90 would both give 50% deviation. Is this a legitimate fix? Namely, would there be any caveats I need to be aware of by linearizing my circular outcome measure?

**Edit:** To give more context on my problem: I am looking to predict peaks in a real-time signal. I make a prediction and see how far off I was from the closest peak (0 degrees corresponds to peaks, while 180 degrees corresponds to antipeaks). I'm interested in computing how 'accurate' my different prediction strategies are, so I was considering just looking at deviations from zero degrees. I'm not sure whether this is a completely circular problem in the first place, since the outcome measure is bounded between 180 and -180 degrees.

Abundance
  • What range of angles are you observing? If the range is small you could treat the effect as already linear. If you're observing the full $360$ degrees (or close to this) then yes I agree, something clever is needed – jcken Sep 16 '21 at 13:55
  • The full 360 degrees ... – Abundance Sep 16 '21 at 14:14
  • I think that using a min/max scaler is a simplification, but a common one. – Ciaran Haines Sep 17 '21 at 04:28
  • The answer is a clear no: there's no continuous way of re-expressing the outcome on a linear scale. That's why there exists a theory of circular statistics in the first place. – whuber Sep 18 '21 at 14:53
  • I'm curious as to why this approach wouldn't work; it gets rid of the circular nature of the metric – Abundance Sep 18 '21 at 15:10
  • @Abundance the circle wraps around itself. Random movements to the right and left on the circle may not end up on the right or left. If you make three quarter steps to the left you end up one quarter step to the right. Only if you have a process that does not wrap around the circle, or when the effect can be considered negligible, can you linearize the situation without much trouble (that's why you were asked whether the entire circle has observations; sometimes the observations cluster on only a part of the circle). – Sextus Empiricus Sep 18 '21 at 17:47
  • I guess what I'm asking is if my outcome measures are in the range (-180, 180], so you can't have something like -270 degrees, hence no wrapping, and the outcome that I'm interested in is symmetrical about zero degrees, whether I can linearize my variable. For the purposes of my problem, an angle like 179 and -179 are equivalent, since they both deviate from zero equally. So although I'm measuring angles, I'm not sure if they are circular in the standard sense. – Abundance Sep 18 '21 at 19:39
  • @Abundance, your *output* is in the range -180 to 180. But that doesn't necessarily mean that the underlying process doesn't generate -270. It is just that for the output the -270 is mapped to +90, and that is why you get output values that are only -180 to 180. Your linear model will be wrong if you consider all the cases of +90 in the output as +90 in the underlying process. Some of them might be -270. – Sextus Empiricus Sep 19 '21 at 11:42
  • So the output range, if it is restricted to the range -180 to 180, doesn't tell us much about the process and whether it is circular or not. But... if all your observed data points cluster within a small range only, i.e. they never reach the other side of the circle, then the range of the data is an indication that the underlying process doesn't cause wrapping due to a sum of deflections adding up to going around the entire circle. – Sextus Empiricus Sep 19 '21 at 11:47
  • Re the edit: although the outcome measurement is of circular type, you describe a *distance* on the circle. *Distances are not circular.* (For one thing, they have an obvious lower bound of zero.) In this sense, you don't have a question. They can, however, be richly varied. For a brief account see https://stats.stackexchange.com/a/201864/919. – whuber Sep 21 '21 at 16:32
  • *"I'm not sure whether this is a completely circular problem in the first place, since the outcome measure is bounded between 180 and -180 degrees."* It is not about the outcome being bounded. It is about the underlying mechanism. A clock has 60 minutes but that doesn't mean that a time estimate cannot make errors larger than 30 minutes (it means that you have some sort of censoring and an error of 45 minutes becomes an error of 15 minutes). What you need to consider is whether your prediction strategy can make an error larger than 180 degrees (which then gets projected to a value smaller than 180). – Sextus Empiricus Sep 21 '21 at 20:22
  • @SextusEmpiricus, yes, I think that is the main tension. Since my prediction considers the closest peak, it's not possible to get an error larger than 180 that is projected backwards, so it wouldn't be circular in the first place. – Abundance Sep 22 '21 at 17:33

3 Answers

5

You cannot validly linearize a circular measure which spans 360°, assuming the circularity of that measure is valid.

Any transformation which "linearizes" a circular measure must necessarily privilege some value as being maximally linearly distant from some other value by virtue of lying on the other side of whatever point the transformation uses as its point of either "unwinding" or "flattening" the circle. This maximal linear distance will be a fiction created entirely as an artifact of the transformation, and will not exist in the original circular measure. The same holds true in a continuous circular measure of degrees, radians, etc. In fact, by carefully choosing the privileged point in your transformation function, you could probably fabricate any relationship you wanted between outcome and predictor, by ensuring that certain linearized values become either the largest, smallest, or middlemost.

Modular measurements, whether discrete or continuous, have important characteristics which have no representation in linear forms. This is why treating the modular numbers on the face of a clock as truly (linear) natural numbers makes no sense. For a simple example, as we actually read the 12 hour clock, $1 - 12 = 1$, $3 - 9 = 9 - 3 = 6$, etc. But linearizing the clock's hours into integers would mean that $1 - 12 = -11$, and that $3 - 9 \ne 9 - 3$.
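The clock arithmetic above can be checked with a minimal Python sketch. The helper name `clock_diff` and the choice to report a zero remainder as 12 (matching a dial labelled 1 to 12) are assumptions made for illustration only.

```python
def clock_diff(a: int, b: int, modulus: int = 12) -> int:
    """Difference a - b as read on a clock face labelled 1..modulus.

    Assumption for this sketch: a remainder of 0 is reported as `modulus`,
    because the dial shows 12 rather than 0.
    """
    remainder = (a - b) % modulus  # Python's % returns a non-negative result here
    return modulus if remainder == 0 else remainder


# Modular reading of the clock
print(clock_diff(1, 12))  # 1
print(clock_diff(3, 9))   # 6
print(clock_diff(9, 3))   # 6

# The same hours treated as ordinary (linear) integers
print(1 - 12)  # -11
print(3 - 9)   # -6
print(9 - 3)   # 6
```

On the dial the two differences between 3 and 9 agree, while the linear integers give $-6$ and $6$.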

Alexis
  • You can have problems in which case a point is privileged. For instance a starting point or a control. – Sextus Empiricus Sep 21 '21 at 20:32
  • @SextusEmpiricus I think I see your point. Can you provide an example of such privilege "assuming the [360°] circularity of that measure is valid"? I understand that one can have an actually linear measure that for some reason got represented as a circular variable, and you want to undo that, but assuming the circularity of the original measure is actually valid where would such a privilege emerge? – Alexis Sep 21 '21 at 21:36
  • An example is the OP's, where he is looking at the difference with respect to his prediction. The point of his prediction is the privileged point, which can be set at zero. I agree you still cannot linearize when circular arithmetic plays a role. But this may not be the situation when the deflections are only small (such that you never reach beyond the other side of the circle). So there are two reasons why linearizing can be bad: 1) circular arithmetic, 2) no necessary privileged zero. However, you cannot say 'you cannot (ever) linearize'. It *depends* on those two points. – Sextus Empiricus Sep 22 '21 at 07:14
  • @SextusEmpiricus Yeah, I am down with that. :) That said, I think the assumption my answer opens with captures this. Your answer was clear about your "it depends" approach, and mine is addressing the question as one of validly circular measure → linear measure. – Alexis Sep 22 '21 at 17:32
2

I don't want to complicate my model (using a linear mixed effects model) by using circular statistics, so I was wondering if I can use the absolute deviation expressed as a percentage of 180?

...Is this a legitimate fix?

There is not enough information to tell whether this is legitimate or not.

The problem with circular systems is that they wrap around themselves. For instance, if you make three quarter turns to the left then you end up one quarter turn to the right.

So a large random step/movement/change/deviation/effect (whatever you want to call it) might end up as being measured/observed as a small step and in the opposite direction. What you observe as a single quarter step might in reality be three quarter steps.

If you treat the circle as a linear variable then you will not be taking this into account and you will wrongly interpret the values.
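As a rough illustration of this wrapping, here is a minimal Python sketch. The helper `wrap_degrees` and the convention of reporting angles in $(-180, 180]$ are assumptions chosen to match the range mentioned in the question; negative values stand for turns to the left.

```python
def wrap_degrees(angle: float) -> float:
    """Map any angle in degrees onto the interval (-180, 180]."""
    wrapped = (angle + 180.0) % 360.0 - 180.0
    # The formula above yields -180 for the antipode; report it as +180 instead.
    return 180.0 if wrapped == -180.0 else wrapped


true_change = -270.0  # three quarter turns to the left (hypothetical value)
observed = wrap_degrees(true_change)

print(observed)            # 90.0, i.e. one quarter turn to the right
print(wrap_degrees(90.0))  # 90.0, indistinguishable from the case above
```

A linear model only sees the observed `90.0` in both cases, so it cannot tell a true change of 90 degrees from one of -270 degrees.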


If the nature of your data is such that you do not get this effect of revolving/wrapping around the circle, that is, if your changes are small enough that you will see no values, or only negligibly few values, that deviate by more than half a circle, then you can use a linearized variable.

You speak about 'using the absolute value only' and you want to ignore the direction of change. It is unclear why you (need to) do this. Depending on the task and the data that you have, you might choose to do this. It is not wrong in principle and it does occur. However, in order to be able to say whether it is good for your case, the details of the case need to be known.

Sextus Empiricus
  • I'm accepting this as the answer as it best addressed the situation I had. The real question was whether the metric I had was circular or not, and it appears that my metric was not. The proposed linearization was correct only because the problem wasn't circular to begin with. – Abundance Sep 22 '21 at 17:40
1

Since you are interested in deviation from 0 (and not the direction), it would be appropriate to use $|\theta|$ as your variable.

You've defined the problem in a way such that $-90$ and $+90$ (and similarly, $-2$ and $+2$) are the same outcome so one can take the absolute value and replace the circular problem with a linear scale going from $0$ to $180$.

Your solution is equivalent to what I describe, with an additional rescaling (dividing by $1.8$) so that the scale runs from $0$ to $100$ instead.

Circular statistics are vital when we care about position on a circle and need to account for the ends of the line wrapping around. In your problem instead of connecting the ends of the line together we are folding the line in half (not just matching $-180$ to $180$, but matching every $-\theta$ to $\theta$).
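The folding and rescaling described above can be sketched in Python as follows. The function names are hypothetical, and the input is assumed to already lie between $-180$ and $180$ degrees.

```python
def folded_deviation(theta_deg: float) -> float:
    """Fold the circle in half: map theta and -theta to the same value in [0, 180]."""
    return abs(theta_deg)


def percent_deviation(theta_deg: float) -> float:
    """Rescale the folded deviation to 0..100 (the same as dividing by 1.8)."""
    return 100.0 * folded_deviation(theta_deg) / 180.0


for theta in (-180.0, 180.0, -90.0, 90.0, -2.0, 2.0):
    print(theta, folded_deviation(theta), percent_deviation(theta))
```

Under these assumptions, $\pm 180$ both map to 100% and $\pm 90$ both map to 50%, matching the percentages proposed in the question.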

Adam Kells
  • If the deviation is intended to be used as an explanatory variable, this can work. But if the deviation is a *response* variable, this destroys its essential circular nature and likely leads to a ruinously bad model. Thus, it is important to qualify your advice ("you can use" and "is fine") by explaining its limitations and scope of application. – whuber Sep 20 '21 at 14:03
  • I'm not sure that I agree. If we want our response variable to be the angle then there's no way to map to a linear scale since any mapping will break the essential circular statistics (linking -180 to 180). But if we only care about distance from a particular point on the circle, we then have defined a linear problem. Now from our problem definition, not only is $-180=180$ but $-2=2$ as well. – Adam Kells Sep 20 '21 at 14:57
  • This is *not* a "linear problem" when the response is circular: even when you care only about the distance and the response is circular, then a response of (say) 170 is either the same as another response of 170 or it's 20 degrees from it. This crucial distinction is lost when you replace the angle by its absolute value. – whuber Sep 20 '21 at 16:13
  • Agreed. I may have misinterpreted the original question as not caring about this distinction. We can treat the problem as linear if we don’t care about this, but I agree that since we don’t have the full problem specification this can’t be assumed. – Adam Kells Sep 20 '21 at 17:18
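
The distinction discussed in these last comments (two responses of 170 after folding may be identical or 20 degrees apart) can be made concrete with a small Python sketch. The helper `circular_distance` is a hypothetical name, and the sketch assumes angles measured in degrees.

```python
def circular_distance(a_deg: float, b_deg: float) -> float:
    """Shortest angular separation between two angles, in degrees (0..180)."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


first, second = 170.0, -170.0  # hypothetical signed responses

# After taking absolute values the two responses look identical...
print(abs(first), abs(second))           # 170.0 170.0

# ...yet on the circle they are 20 degrees apart.
print(circular_distance(first, second))  # 20.0
print(circular_distance(first, first))   # 0.0
```

Whether this lost information matters depends on whether the analysis only needs the distance from zero or also the relationship between responses.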