
I have binary count data as the response variable in my logistic regression. The independent variables include, among others, inclination and orientation measurements, recorded in degrees of arc. 'Orientation' (or aspect) ranges from 0° to 360°, and 'inclination' from 0° to 90°. Where 'inclination' is 0, the orientation is coded as '-1', because horizontal surfaces do not face any direction.

For a logistic regression, my usual workflow includes using R's `scale` function to standardize all continuous variables, among them 'inclination' and 'orientation', and that is what I did. But does that make sense here? Keep in mind that an orientation of 0° (north) is the same as 360° (also north), and that 1° and 359° are only two degrees apart.
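
To make the concern concrete, here is a minimal sketch in Python (chosen only for illustration; the same arithmetic applies to R's `scale`). The helper names `zscore` and `angular_gap` and the sample orientations are hypothetical. It compares the gap between 1° and 359° after standardizing with their actual separation on the circle.

```python
import statistics

def zscore(values):
    """Standardize as (x - mean) / sample SD, the default behaviour of R's scale()."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD with n - 1 in the denominator
    return [(v - mean) / sd for v in values]

def angular_gap(a_deg, b_deg):
    """Smallest angle between two compass bearings, in degrees."""
    diff = abs(a_deg - b_deg) % 360
    return min(diff, 360 - diff)

# Hypothetical orientations in degrees; 1 and 359 both face almost due north.
orientation = [1, 90, 180, 270, 359]
z = zscore(orientation)

linear_gap = abs(z[0] - z[-1])   # distance between 1 and 359 after standardizing
true_gap = angular_gap(1, 359)   # distance between 1 and 359 on the circle

print(f"gap after standardizing: {linear_gap:.2f} SD")
print(f"gap on the circle: {true_gap} degrees")
```

After standardizing, the two nearly identical bearings end up at opposite extremes of the variable, which is the distortion I am worried about.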

How can I standardize these measurements? How would you recode an orientation of '-1', which is neither north, east, south, nor west? At this point, both variables appear to be highly influential on my model fit, but can I trust that conclusion?

Nick Cox
Achu Mani
  • Aspect should minimally be represented as sine and cosine. But it is hard to suggest anything other than regarding the values of `-1` as indicating missing. Inclination you could keep as is or use sine or cosine if there is a scientific argument for that. – Nick Cox Nov 05 '15 at 18:06
  • I don't know what R's scale-function does but it's generally nonsensical to treat circular or spherical variables in any but their own terms, for the reason you identify. – Nick Cox Nov 05 '15 at 18:08
  • @Nick Unfortunately, -1 does not mean missing: it means undefined. It is a definite indication of a practically zero slope. A good model would downweight the effect of aspect at low slopes to the extent that the aspects for horizontal surfaces would not make any difference. In effect, (aspect, slope) ought to be considered a point on the unit sphere. The coordinates (-1, 0) correspond to straight up rather than "missing." – whuber Nov 05 '15 at 20:30
  • I don't see the practical distinction from "missing" if `-1` cannot be used with its standard numeric meaning and there is nothing different to put in its place. – Nick Cox Nov 05 '15 at 22:16
  • @Nick the default effect of `scale` is to standardize -- the question mentions this but wasn't sufficiently explicit to make the statement unambiguous. That is, it produces $\frac{x_i-\bar{x}}{s}$ where $s$ is the standard deviation of the $x$-values... which is to say, it gives the usual (internal) "*z-score*"[.](https://ismcvc.files.wordpress.com/2014/01/oh-my.jpg) – Glen_b Nov 05 '15 at 22:21
  • @Nick The point is that $-1$ for the aspect is *meaningful* and *definite*. It should not be processed in the same way "missing" values usually are. In particular, it makes no sense to impute it and no sense to throw it away. – whuber Nov 06 '15 at 03:30
  • @whuber I am certainly happy with that way of putting it. – Nick Cox Nov 06 '15 at 08:51
  • @whuber: Any idea how I could combine both values into one? I think what you mentioned makes a lot of sense: both values should be seen as a single point on a sphere, and '-1' could then be included meaningfully. But do you have any idea how I could express that? Maybe as a vector normal to the surface? – Achu Mani Nov 06 '15 at 10:03
  • @Nick: Thank you, I corrected the typo. If I understood you correctly, I should either transform the values into sine or cosine or leave them as they were, right? Can you explain, or link to a source on, why sine or cosine would be more meaningful than degrees of arc? – Achu Mani Nov 06 '15 at 10:05
  • An angle between 0 and 359 is utterly unsuitable as a predictor; it is only the angle between 0 and 90 that might be OK, but much depends on the science. The only possible exception is if the angles span only a small fraction of the circle and do not cross zero. It's a standard technique to use sinusoids here. Here is one tutorial review http://www.stata-journal.com/sjpdf.html?articlenum=st0116 – Nick Cox Nov 06 '15 at 10:12
  • Note further that scaling degrees by (value $-$ mean) / SD completely destroys the angular information in a degree of arc predictor. – Nick Cox Nov 06 '15 at 11:24
  • @Nick: Yes, I understood that. So for my tilt angles, which don't cross 0 and are limited to one quarter of a circle, I won't scale or transform, and will just leave them as they are. I read through the paper you linked, but I don't fully understand how to transform my orientation values to sine or cosine. Do I have to pick one (sine or cosine) and transform all angles to those values, or do I need to transform one part of the angles (e.g., those under 180°) to either sine or cosine and the other half differently? I am not well educated in trigonometry and was hoping to get clearer instructions. – Achu Mani Nov 06 '15 at 11:46
  • The key point is that you need (at least) both sine and cosine of all angles between 0 and 360 deg; that's the way to ensure that your predictions are a periodic function of angle. It's really the main point of the paper developed at length. Sorry if you think I am unclear, but explaining elementary trigonometry is more than I want to do and if the paper is pitched in the wrong way for you I don't think I can do better. There is a book on circular statistics in R, but my recollection is that it does not provide the introduction you are asking for. – Nick Cox Nov 06 '15 at 11:57
  • You have many options, because practically any coordinate system or projection of the sphere that is continuous throughout the upper hemisphere (which excludes angles of longitude, as noted by @NickCox) will do the job--mathematically. Some choices will work better than others in terms of creating approximate linear relationships with the response. Often a theoretical understanding of this relationship can guide the choice. Are you able to disclose the nature of your response variable? – whuber Nov 06 '15 at 13:11
  • Yes, we study the influence of a variety of factors on the presence or absence of photovoltaic energy generating modules (i.e., solar cells) on domestic rooftops. Those roofs are all characterized by inclination, orientation, surface area and so on. Those characteristics are part of the regression to predict further diffusion of domestic solar energy production. We expect that a south-facing roof (orientation between 90 and 270 degrees) is more likely to host a solar cell (at least in the northern hemisphere), and want to make that variable part of our regression. – Achu Mani Nov 06 '15 at 13:55
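
One way to read the suggestions in the comments above (sine and cosine of aspect, and treating aspect and slope together as a point on the unit sphere) is to encode each roof by the unit vector normal to its surface. The following is only a sketch under stated assumptions: the function name `roof_normal`, the (east, north, up) axis convention, and the sample roofs are hypothetical, and Python is used purely for illustration. Nobody in the thread prescribes this exact encoding.

```python
import math

def roof_normal(aspect_deg, inclination_deg):
    """Return (east, north, up) components of the unit normal to a surface.

    aspect_deg: compass bearing the surface faces (0 = north, 90 = east),
                or -1 for a horizontal surface with no defined aspect.
    inclination_deg: tilt from horizontal, between 0 and 90.
    """
    if inclination_deg == 0:
        # Horizontal surface: the normal points straight up, so aspect
        # (including the -1 code) has no influence on the result.
        return (0.0, 0.0, 1.0)
    if aspect_deg == -1:
        # Assumption: -1 is only valid together with zero inclination.
        raise ValueError("aspect -1 is only defined for inclination 0")
    tilt = math.radians(inclination_deg)
    bearing = math.radians(aspect_deg)
    east = math.sin(tilt) * math.sin(bearing)
    north = math.sin(tilt) * math.cos(bearing)
    up = math.cos(tilt)
    return (east, north, up)

# Hypothetical roofs as (aspect, inclination) pairs in degrees.
roofs = [(0, 30), (360, 30), (180, 30), (-1, 0)]
for aspect, inclination in roofs:
    east, north, up = roof_normal(aspect, inclination)
    print(f"aspect={aspect:>4} incl={inclination:>2} -> "
          f"east={east:+.3f} north={north:+.3f} up={up:+.3f}")
```

In this encoding 0° and 360° give the same vector, the sine and cosine of aspect are both present, their weight shrinks as the slope approaches zero, and a flat roof coded `-1` maps to "straight up" instead of being dropped or imputed. The three components could then replace raw aspect and inclination as predictors, with a negative north component indicating a south-facing roof. Whether these terms relate approximately linearly to the response is a separate question that depends on the science.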

0 Answers