
In a paper I've written I model the random variables $X+Y$ and $X-Y$ rather than $X$ and $Y$, to effectively remove the problems that arise when $X$ and $Y$ are highly correlated and have equal variance (as they do in my application). The referees want me to give a reference. I could easily prove it, but since it is an applications journal, they prefer a reference to a simple mathematical derivation.

Does anyone have any suggestions for a suitable reference? I thought there was something in Tukey's EDA book (1977) on sums and differences but I can't find it.

Rob Hyndman
  • Wikipedia has a reference to a textbook at http://en.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient#Removing_correlation ; not sure that helps... – shabbychef Jul 27 '11 at 03:42
  • And the proof is indeed more than trivial with equal variances: $\operatorname{Cov}(X+Y, X-Y) = E\left[\bigl((X-\mu_X)+(Y-\mu_Y)\bigr)\bigl((X-\mu_X)-(Y-\mu_Y)\bigr)\right] = \operatorname{Var}(X) - \operatorname{Var}(Y) = 0$... Good luck, Rob. – Dmitrij Celov Jul 27 '11 at 10:25
  • Perhaps consider the two-variable case as a special case of a multivariate rotation? – Dmitrij Celov Jul 27 '11 at 10:37
  • Tukey doesn't *prove* anything in EDA: he proceeds by example. For an example of looking at $y+x$ versus $y-x$ see Exhibit 3 of chapter 14, p. 473 (the discussion begins on p. 470). – whuber Jul 27 '11 at 13:32
  • An alternative way to get around having to provide a reference: you can consider it a case of modelling the principal components of your data $X,Y$, rather than the individual variables themselves. That would be an easy thing to provide a reference for. – probabilityislogic Jul 29 '11 at 23:13
  • Closely related: [What is the intuition behind the independence of $X_2 - X_1$ and $X_2 + X_1$, $X_i\sim\mathcal N(0,1)$?](http://stats.stackexchange.com/q/112692/7290) – gung - Reinstate Monica Aug 21 '14 at 15:36

1 Answer


I would refer to Seber, G. A. F. (1977). *Linear Regression Analysis*. Wiley, New York. Theorem 1.4.

This says $\text{cov}(AX, BY) = A \text{cov}(X,Y) B'$.

Take $A = (1\ \ 1)$ and $B = (1\ \ {-1})$, and let both $X$ and $Y$ be the vector $(X,\, Y)'$, so that $AX = X+Y$ and $BY = X-Y$.
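The identity can be checked numerically with numpy; the covariance matrix below is made up for illustration (equal variances, strong correlation):

```python
import numpy as np

# Hypothetical covariance matrix for (X, Y): equal variances, correlation 0.9.
Sigma = np.array([[2.0, 1.8],
                  [1.8, 2.0]])

A = np.array([[1.0, 1.0]])   # forms X + Y
B = np.array([[1.0, -1.0]])  # forms X - Y

# cov(AZ, BZ) = A Sigma B'  -> here var(X) - var(Y) = 2.0 - 2.0 = 0
cov_sum_diff = (A @ Sigma @ B.T).item()
print(cov_sum_diff)  # 0.0
```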

Note that, to have $\text{cov}(X+Y, X-Y) \approx 0$, it's critical that $X$ and $Y$ have similar variances. If $\text{var}(X) \gg \text{var}(Y)$, then $\text{cov}(X+Y, X-Y) = \text{var}(X) - \text{var}(Y)$ will be large.
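A small simulation sketch (numpy, with made-up parameters) illustrating both cases:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Equal variances: var(X) = var(Y) = 1, correlation about 0.9.
x = rng.normal(0, 1, n)
y = 0.9 * x + np.sqrt(1 - 0.9**2) * rng.normal(0, 1, n)

r_equal = np.corrcoef(x + y, x - y)[0, 1]
print(r_equal)    # near 0: the sum and difference are almost uncorrelated

# Unequal variances: var(Y2) = 9 while var(X) = 1.
y2 = 3 * y
r_unequal = np.corrcoef(x + y2, x - y2)[0, 1]
print(r_unequal)  # strongly negative
```

Even with the same strong correlation between the underlying variables, inflating one variance makes the sum and difference strongly correlated.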

Karl
  • For $W$ and $Z$ to be uncorrelated (or nearly uncorrelated), we don't need $\operatorname{cov}(W,Z)$ to be $0$ or nearly $0$: we need the Pearson correlation coefficient $\rho_{W,Z}$ to be $0$ or nearly $0$. – Dilip Sarwate Aug 22 '14 at 15:52