
Consider the standard simple linear regression model: $$ Y_i = \beta_0 + \beta_1 X_i + \epsilon_i, $$ for $i=1,\dots,n$. In matrix-vector form this is $$ \mathbf{Y} = \mathbf{X_n}\beta + \epsilon, $$ where, in particular, $$ \mathbf{X_n} = \begin{bmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{bmatrix}. $$

In the case of simple linear regression, the determinant of the matrix $\mathbf{X_n}^T\mathbf{X_n}$ is $$ \text{Det}(\mathbf{X_n}^T\mathbf{X_n}) = n \sum_{i=1}^n(X_i - \overline X_n)^2, $$ where $\overline X_n = \frac{1}{n} \sum_{i=1}^n X_i$.
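This identity is easy to check numerically. A minimal sketch (my own addition, assuming NumPy and an arbitrary simulated design):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])  # design matrix with intercept column

det_direct = np.linalg.det(X.T @ X)
det_formula = n * np.sum((x - x.mean()) ** 2)

print(det_direct, det_formula)  # the two agree up to floating-point error
```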

Now consider multiple regression with $p$ regressors, in which case: $$ \mathbf{X_n} = \begin{bmatrix} 1 & X_{11} & X_{12} & \dots & X_{1p} \\ 1 & X_{21} & X_{22} & \dots & X_{2p} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & X_{n1} & X_{n2} & \dots & X_{np} \end{bmatrix}. $$

Each row is an iid observation of the (population) covariates; while the rows are iid, the random variables within a given row can be dependent on one another.

So for the matrix $\mathbf{X_n}$ in the case of multiple regression:

  1. Is there a known formula for $\text{Det}(\mathbf{X_n}^T\mathbf{X_n})$? Of course, the general formula for the determinant of a $(p+1)\times (p+1)$ matrix is one such formula, but I am wondering if there is something 'nicer'. The formula for simple linear regression is really nice because it is just a sum over $(X_i - \overline X_n)^2$; is there something analogous for multiple regression, or at the very least something nicer than the general determinant formula?
  2. Is $\text{Det}(\mathbf{X_n}^T\mathbf{X_n})$ guaranteed to be positive like it was in the case of simple linear regression?
  3. Note that in the case of simple linear regression $$\lim_{n \to \infty}\frac{1}{n^{2}}E[\text{Det}(\mathbf{X_n}^T\mathbf{X_n})] = \lim_{n \to \infty}\frac{1}{n} E\left[\sum_{i=1}^n(X_i - \overline X_n)^2\right] = \lim_{n \to \infty}\frac{n-1}{n}\sigma^2 = \sigma^2.$$ Now, for the multiple regression case, what does $\frac{1}{n^{p+1}}E[\text{Det}(\mathbf{X_n}^T\mathbf{X_n})]$ converge to as $n \to \infty$? Does it converge to some population quantity, as in the simple regression case? Note that we need the scaling to prevent the determinant from blowing up; see the comment by jld below.
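To make question 3 concrete, here is a small simulation sketch (my own addition, not part of the original post). It assumes iid standard-normal covariates with $p=2$, so the second-moment matrix of a row $(1, X_{i1}, X_{i2})$ is the identity and the scaled determinant should settle near $1$:

```python
import numpy as np

rng = np.random.default_rng(1)
p = 2

def scaled_det(n):
    Z = rng.normal(size=(n, p))           # iid rows of covariates
    X = np.column_stack([np.ones(n), Z])  # n x (p+1) design matrix
    return np.linalg.det(X.T @ X) / n ** (p + 1)

# Under this assumed design, E[x x^T] = I, so det E[x x^T] = 1.
for n in (100, 1000, 10000):
    print(n, scaled_det(n))
```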
sonicboom
  • $\det(X^T X)$ is non-negative, but will be zero iff there exists a linear dependence among the columns of $X$, so it is not guaranteed to be *positive* – jcken Mar 22 '21 at 18:17
  • In (3), what are you assuming about the distribution of the $X_n$? @Erik How are you computing the determinants of non-square matrices?? – whuber Mar 22 '21 at 18:37
  • Oh yes, good point. – Eric Perkerson Mar 22 '21 at 18:37
  • @Erik I wondered, because there *are* methods to construct determinants from rectangular matrices. I describe one at https://stats.stackexchange.com/a/512862/919. – whuber Mar 22 '21 at 18:39
  • @whuber A matrix times its transpose is a square matrix. I am interested in the classical regression setting, the elements of the random vectors $X_i = [1, X_{i1},\dots,X_{ip}]$ are i.i.d. continuous random variables. I am not sure if we need additional conditions for the expectation of the determinant, and its limit, to exist in the multiple regression case? – sonicboom Mar 22 '21 at 18:55
  • Yes, you do need conditions. The expectation of the determinant depends on the specifics of the process that creates a sequence of rows of $X.$ (My earlier comment referred to a now-deleted comment wherein another user referred to the determinants of $X$ and $X^\prime,$ btw). – whuber Mar 22 '21 at 19:10
  • Under reasonable regularity conditions $\frac 1n X_n^TX_n \to_p \Sigma$ with $\Sigma = E[x_i x_i^T]$ the second-moment matrix of the rows, so, since $\det$ is continuous, $\det \frac 1n X^T_nX_n \to_p \det \Sigma$. Then $\det X_n^TX_n = n^{p+1} \det \frac 1n X_n^T X_n$ so this will generally blow up – jld Mar 22 '21 at 19:17
  • @jld What does it converge to in expectation if we scale it appropriately, i.e. what does $\frac{1}{n^{p+1}} E[\text{det} X_n^T X_n]$ converge to? I will edit the post to put that scaling in. – sonicboom Mar 22 '21 at 20:02
  • @whuber Each row is a vector of covariate observations of iid random variables. E.g. the $n$ rows in the second column correspond to $n$ observations of some random variable. So – sonicboom Mar 22 '21 at 20:07
  • While each row corresponds to an iid observation of the covariates, the random variables in a given row can be dependent on one another. – sonicboom Mar 22 '21 at 20:13

1 Answer


Just 2), since I am not an expert: $\mathbf X^T \mathbf X$ is positive definite when $\mathsf{rk}(\mathbf X) = p+1$, i.e. when $\mathbf X$ has the maximum possible number of linearly independent columns. In that case the determinant you ask about is positive, since it is the product of the $p+1$ positive eigenvalues of $\mathbf X^T \mathbf X$.

It is clearer to write $\mathbf X_{n \times (p+1)}$, since that is the shape of the matrix.

PS: Statisticians usually write the transposed matrix as $\mathbf X'$.
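A quick numerical illustration of the rank condition (a sketch of my own, assuming NumPy; the collinear column is a made-up example): a full-column-rank design gives a strictly positive determinant, while a rank-deficient one gives zero.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 30
Z = rng.normal(size=(n, 2))

# Full column rank: intercept plus two distinct covariates (rank 3).
X_full = np.column_stack([np.ones(n), Z])
# Rank-deficient: the third column is an exact multiple of the second.
X_sing = np.column_stack([np.ones(n), Z[:, 0], 2 * Z[:, 0]])

print(np.linalg.det(X_full.T @ X_full))  # strictly positive
print(np.linalg.det(X_sing.T @ X_sing))  # 0 up to floating-point error
```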

Good Luck
  • There is no need to write the matrix in bold, but I followed your notation. – Good Luck Mar 22 '21 at 23:07
  • This answer is incorrect: $X^\prime X$ is not guaranteed to be positive definite. One or more of the eigenvalues can be zero. – whuber Mar 28 '21 at 16:40
  • As I [checked](https://math.stackexchange.com/questions/4080863/condition-on-x-so-that-xtx-is-positive-definite/4080866?noredirect=1#comment8432023_4080866) the condition $\mathsf {rk}(\mathbf X)= p+1$ is sufficient for the covariance matrix to have all positive eigenvalues. – Good Luck Mar 28 '21 at 20:51
  • Although that statement is correct, it does not fix what you wrote: "the determinant you ask is positive since it is a product of p+1 eigenvalues of $X^\prime X.$" That is erroneous because it implicitly assumes $X^\prime X$ is positive definite. – whuber Mar 28 '21 at 20:54
  • Thanks for your attention; I will think on it more and perhaps update in the next few days. – Good Luck Mar 28 '21 at 20:57