
Let $X$ be a matrix of size $m \times n$.

  1. This link from another Cross Validated thread says that if $X$ is centred (its column means are subtracted), then the covariance matrix is $C = \frac{1}{n-1} X^T X$.

  2. In the machine learning lecture notes from my class, I have $C = \frac{1}{n-1} X X^T$ if $X$ is centred. The two results for $C$ differ: the first is of size $n \times n$ and the second is of size $m \times m$. Which one should I follow? What are the differences?

(the context is the application of PCA)
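For concreteness, here is a minimal NumPy sketch (toy data and variable names of my own) checking that the two formulas coincide once you fix whether observations sit in rows or in columns:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 10 observations (rows) of 3 variables (columns).
X = rng.normal(size=(10, 3))
Xc = X - X.mean(axis=0)          # centre each variable (remove column means)

n_obs = Xc.shape[0]

# Rows-are-observations convention: covariance of the 3 variables is 3 x 3.
C_rows = Xc.T @ Xc / (n_obs - 1)

# Columns-are-observations convention: transpose the data, then the
# "other" formula produces the same 3 x 3 matrix.
Y = Xc.T                         # variables in rows, observations in columns
C_cols = Y @ Y.T / (n_obs - 1)

# Both agree with NumPy's own estimator (rowvar=False means rows are observations).
assert np.allclose(C_rows, C_cols)
assert np.allclose(C_rows, np.cov(X, rowvar=False))
print(C_rows.shape)              # (3, 3)
```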

Logan
  • Do you have a reference for the machine learning lecture notes? – Greenparker Jun 03 '17 at 07:14
  • Reference added. @Greenparker – Logan Jun 03 '17 at 07:28
  • They are equivalent expressions. All that matters is whether you arrange your variables as rows or columns in your dataset. – StatsStudent Jun 03 '17 at 07:33
  • So to clarify, go back in your course notes and determine whether you are arranging your $X$ matrix by having observations (e.g. subjects, people, stocks, companies, etc.) in rows and variables (e.g. answer to Q1 on a survey, age, gender, etc.) as columns or vice versa. You'll note the difference is due to the definition of how $X$ is arranged. – StatsStudent Jun 03 '17 at 07:53
  • There is no such thing as "the covariance of a matrix". If X is the design matrix of a linear fit, then C is the covariance of the fit parameters. – David Wright Jun 03 '17 at 07:55
  • If you were to delete the single occurrence of the phrase "The covariance" in this text, it would be just as clear and mean exactly the same thing. In other words, it doesn't matter what the authors chose to call $C$--what matters is how it was constructed and the role it plays in understanding SVD. – whuber Jun 03 '17 at 15:03
