I'm trying to use an inverse hyperbolic sine transformation to reduce the effect of outliers in my target variable. Unfortunately, I don't appear to have access to the basic papers on it. I've found the formulation but am not sure how to estimate the theta parameter for it. Does anyone know?

Thanks

tomas
  • Please clarify the question: what is theta? The inverse hyperbolic sine function is $asinh(z)=\log(z+\sqrt{z^2+1})$ - no theta in sight. – Aniko Apr 12 '12 at 18:45
  • A useful followup thread (which includes a more accurate description of the IHS that is valid for negative $z$ as well as positive $z$) is at http://stats.stackexchange.com/questions/157532. – whuber Jun 18 '15 at 11:44

1 Answer

The basic idea is as follows.

You have the IHS transformation

$$z_j = g_j(y_j;\theta)= \operatorname{sinh}^{-1}(\theta y_j)/\theta,\,\,j=1,...,n.$$

Then you have to find the value of $\theta$ that maximises the concentrated log-likelihood

$$L(\theta) = -\dfrac{n}{2}\log[g(\theta)^TMg(\theta)] - \dfrac{1}{2}\sum_j\log(1+\theta^2 y_j^2),$$

where $g(\theta)=(g_1(y_1;\theta),...,g_n(y_n;\theta))$, $M = I - X(X^TX)^{-1}X^T,$ and $X$ is the matrix of explanatory variables.
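As a rough illustration of the maximisation step, here is a minimal Python sketch. It is not taken from the paper: the function names (`ihs`, `concentrated_loglik`, `estimate_theta`), the grid search, and the default search range for $\theta$ are all my own assumptions, and the simulated data are purely illustrative. It uses the fact that $g(\theta)^TMg(\theta)$ is the residual sum of squares from regressing $g(\theta)$ on $X$, so $M$ does not need to be built explicitly.

```python
import numpy as np


def ihs(y, theta):
    """Inverse hyperbolic sine transform: asinh(theta * y) / theta."""
    return np.arcsinh(theta * y) / theta


def concentrated_loglik(theta, y, X):
    """Concentrated log-likelihood L(theta) as written in the formula above."""
    n = len(y)
    g = ihs(y, theta)
    # g' M g equals the residual sum of squares from regressing g on X,
    # so M never has to be formed explicitly.
    beta, *_ = np.linalg.lstsq(X, g, rcond=None)
    resid = g - X @ beta
    rss = resid @ resid
    # Second term: log-Jacobian of the transform, since
    # d/dy [asinh(theta*y)/theta] = 1 / sqrt(1 + theta^2 y^2).
    log_jacobian = -0.5 * np.sum(np.log1p((theta * y) ** 2))
    return -0.5 * n * np.log(rss) + log_jacobian


def estimate_theta(y, X, grid=None):
    """Return the grid value of theta that maximises L(theta)."""
    if grid is None:
        # Assumed search range; adjust it to the scale of y.
        grid = np.logspace(-4, 2, 601)
    loglik = np.array([concentrated_loglik(t, y, X) for t in grid])
    return grid[np.argmax(loglik)]


if __name__ == "__main__":
    # Simulated data (illustrative only): z = X b + noise, y = sinh(theta * z) / theta
    rng = np.random.default_rng(0)
    n, true_theta = 500, 0.5
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    z = X @ np.array([1.0, 2.0]) + rng.normal(size=n)
    y = np.sinh(true_theta * z) / true_theta
    print("estimated theta:", estimate_theta(y, X))
```

A grid search is used here only because it is easy to check by eye; a one-dimensional optimiser could refine the result around the best grid point. Note that $\theta$ depends on the units of $y$, so the search range should be chosen with the scale of the data in mind.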

I hope this helps.

Ref: Alternative Transformations to Handle Extreme Values of the Dependent Variable

Author(s): John B. Burbidge, Lonnie Magee, A. Leslie Robb

Source: Journal of the American Statistical Association, Vol. 83, No. 401 (Mar., 1988), pp. 123-127

  • Yeah, this is what I was looking for, thanks. Unfortunately I don't have access to JSTOR. Just to double-check: in that first term, is it the log of an absolute value? – tomas Apr 12 '12 at 19:58
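  • No, those are just square brackets [ ]. The term inside the log is a quadratic form ($M$ is a projection matrix), so it cannot be negative, and it is positive unless the regression fits $g(\theta)$ exactly. –  Apr 12 '12 at 20:05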
  • I have corrected the formula to use $\sinh^{-1}$ instead of $\sinh$. –  Apr 12 '12 at 20:10