
I am interested in using the squared log-difference between predictions and observations as a loss function. It appears to me to be a very natural way of dealing with errors that are expected to scale proportionally with the observed value when the domain covers multiple orders of magnitude.

This is required in my case since an X% 'miss' of the model prediction should be penalised equally, whether the prediction is small or large (unlike with SSE).

However, it seems that the trade-off of using log residuals is that, at least in a naive implementation such as

$ \mathcal{L}(Y,\hat{Y}) = \left[ \ln \left(\frac{Y}{\hat{Y}}\right) \right]^2 $

the loss function is not symmetrical, since an underestimate is penalised more than an overestimate of the same size relative to $Y$ (e.g. $ \mathcal{L}(10,9) > \mathcal{L}(10,11) $).
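A quick numeric check of this asymmetry (a minimal Python sketch; the helper name is mine):

```python
import math

def squared_log_loss(y, y_hat):
    """Naive squared log-difference loss: [ln(y / y_hat)]^2."""
    return math.log(y / y_hat) ** 2

# Two misses of the same size relative to y = 10:
print(squared_log_loss(10, 9))   # underestimate: ~0.01110
print(squared_log_loss(10, 11))  # overestimate:  ~0.00908
```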

My question is: is there a common transformation that can be used to skew the profile of the loss function to be symmetrical again, without losing the required behaviour of treating relative errors equally, independent of the size of $Y$?

If there is no such transformation, is there any reason not to construct a sort of average log relative error such as:

$ \mathcal{L}_{symmetrical}(Y,\hat{Y}) = \left[ \ln \left(1+ \frac{|Y-\hat{Y}|}{Y} \right) + \ln \left(1-\frac{|Y-\hat{Y}|}{Y} \right) \right]^2 $

which appears to treat overestimates and underestimates equally, and does not penalise larger values of $Y$ more heavily?
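To sanity-check the proposal, here is a minimal Python sketch (the helper name is mine). Since $\ln(1+x) + \ln(1-x) = \ln(1-x^2)$, the loss depends only on the relative error $x = |Y-\hat{Y}|/Y$, which makes it symmetric and scale-invariant by construction; the same identity shows it is undefined once $x \ge 1$, i.e. for $\hat{Y} \ge 2Y$ or $\hat{Y} \le 0$:

```python
import math

def symmetric_log_loss(y, y_hat):
    """Proposed loss: [ln(1 + x) + ln(1 - x)]^2 with x = |y - y_hat| / y.
    Equals [ln(1 - x^2)]^2, so it depends only on the relative error x
    and is undefined once x >= 1 (y_hat >= 2*y or y_hat <= 0)."""
    x = abs(y - y_hat) / y
    return (math.log(1 + x) + math.log(1 - x)) ** 2

print(symmetric_log_loss(10, 9))    # ~1.0101e-04
print(symmetric_log_loss(10, 11))   # same value: over- and underestimates match
print(symmetric_log_loss(100, 90))  # same value again: scale-invariant in y
```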

Zac
  • There exists a rich set of choices you can make: see https://stats.stackexchange.com/a/201864/919 for an account. Note, please, that "symmetrical" itself means a host of different possibilities. See https://stats.stackexchange.com/a/29010/919 for a general definition. – whuber Sep 28 '21 at 23:08
  • 1
    Thank you very much Whuber, I appreciate the links and these have helped me to some other examples. However I think you may have (understandably) answered the title rather than the question detail, which is more concerned with the existence of a transformation of the log-relative error to a symmetric loss function - and specifically to confirm if the example I provided is suitable or flawed. As for 'symmetrical', I take your point (but this is also described for this context in the question, as relative underestimation being penalised equally to relative overestimation). – Zac Sep 29 '21 at 20:14
  • Thank you for the clarification. There are myriad such transformations possible. (Proof: let $f$ be any function of the positive real numbers and set $\mathcal{L}(Y,\hat Y)=f(Y/\hat Y)+f(\hat Y/Y).$ When this symmetrized version of $f$ is strictly increasing, it determines such a transformation.) Thus, a better way to research this issue would begin with considering how you wish to use the loss function and what effects different loss functions might have on your results. – whuber Sep 30 '21 at 13:22
  • Thanks Whuber, I see. On second thought, what about just the relative error $\frac{|Y-\hat{Y}|}{Y}$? It seems to fit the requirements unless I'm missing something (also, $Y$ is strictly positive in my use case); a quick numeric check of both ideas is sketched below. – Zac Oct 17 '21 at 22:29
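Below is a quick numeric sketch (Python; the helper names are mine) of the two ideas raised in the comments: whuber's symmetrized family, using $f(r) = r$ as one illustrative choice, and the plain relative error. Note that the symmetrized family guarantees swap symmetry $\mathcal{L}(Y,\hat{Y}) = \mathcal{L}(\hat{Y},Y)$, which is a different notion of symmetry from penalising $Y \pm \Delta$ equally; the plain relative error does the latter.

```python
# whuber's construction: pick any f on the positive reals and symmetrize it.
# With f(r) = r this gives L = y/y_hat + y_hat/y, which depends only on the
# ratio and satisfies L(y, y_hat) == L(y_hat, y) (swap symmetry).
def symmetrized_loss(y, y_hat, f=lambda r: r):
    return f(y / y_hat) + f(y_hat / y)

# Plain relative error from the last comment (requires y > 0); this one
# penalises y + delta and y - delta equally and is scale-invariant.
def relative_error(y, y_hat):
    return abs(y - y_hat) / y

print(symmetrized_loss(10, 9))                         # ~2.0111
print(symmetrized_loss(9, 10))                         # same: swap-symmetric
print(relative_error(10, 9), relative_error(10, 11))   # 0.1 0.1
print(relative_error(100, 90))                         # 0.1: scale-invariant
```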

0 Answers