
Say I have a random vector $Y\sim N(X\beta,\Sigma)$ and $\Sigma\neq\sigma^2 I$. That is, the elements of $Y$ (given $X\beta$) are correlated.

The natural estimator of $\beta$ is the generalized least squares estimator $\hat{\beta}=(X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}Y$, with $\text{var}(\hat{\beta})=(X'\Sigma^{-1}X)^{-1}$.
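To make the two formulas concrete, here is a minimal NumPy sketch. The function name `gls`, the toy design, and the AR(1)-style $\Sigma$ are assumptions made up for illustration, not part of the question.

```python
import numpy as np

def gls(X, Sigma, y):
    """Return (beta_hat, cov_beta) for y ~ N(X beta, Sigma), Sigma known."""
    Sinv_X = np.linalg.solve(Sigma, X)        # Sigma^{-1} X, without forming Sigma^{-1}
    info = X.T @ Sinv_X                       # X' Sigma^{-1} X
    beta_hat = np.linalg.solve(info, Sinv_X.T @ y)  # uses symmetry of Sigma
    cov_beta = np.linalg.inv(info)            # (X' Sigma^{-1} X)^{-1}
    return beta_hat, cov_beta

# Hypothetical toy example: intercept + slope, correlated errors.
rng = np.random.default_rng(0)
n = 6
idx = np.arange(n)
X = np.column_stack([np.ones(n), idx.astype(float)])
Sigma = 0.5 ** np.abs(idx[:, None] - idx[None, :])   # assumed AR(1)-type correlation
beta = np.array([1.0, 2.0])
y = rng.multivariate_normal(X @ beta, Sigma)

beta_hat, cov_beta = gls(X, Sigma, y)
print(beta_hat)
print(cov_beta)
```

Note that `cov_beta` depends only on `X` and `Sigma`, not on `y`, which is why it can be used to compare designs before any data are collected.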

In a design context, the experimenter can adjust the design, which results in a different $X$ and $\Sigma$ and thus a different $\text{var}(\hat{\beta})$. To choose an optimal design, I see that people often try to minimize the determinant of $(X'\Sigma^{-1} X)^{-1}$. What is the intuition behind this?

Why not, say, minimize the sum of its elements?

kjetil b halvorsen
qoheleth

1 Answer


Minimizing the determinant of $(X'\Sigma^{-1} X)^{-1}$, which is the same as maximizing the determinant of $X'\Sigma^{-1} X$, is known as D-optimal experimental design. The determinant of a covariance matrix is known as the generalized variance, so we are minimizing the generalized variance of $\hat{\beta}$. Other functionals of the covariance matrix could be used as a criterion, but what you propose (minimizing the sum of its elements) does not make much sense. The D-optimality criterion has the big practical advantage of being invariant under nonsingular linear transformations of the regressor variables: replacing $X$ by $XA$ multiplies the determinant by the same constant factor $\det(A)^{-2}$ for every design, so the ranking of designs does not change. Invariance means that the optimal design is not influenced by such things as the choice of measurement units (such as m or km). With non-invariant optimality criteria, the result could depend on such irrelevant things.
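The invariance can be checked numerically. The following is only a sketch under assumptions of my own: the function `criteria`, the two candidate designs, and the correlation model $\Sigma_{ij}=\rho^{|x_i-x_j|}$ are hypothetical choices for illustration. The regressor is rescaled (as in a change from m to km) while $\Sigma$ is held fixed, and the determinant, trace, and sum-of-elements criteria are compared across the two designs.

```python
import numpy as np

def criteria(points, scale=1.0, rho=0.5):
    """Design criteria for the model E[y] = b0 + b1 * (scale * x).

    Sigma is built from the unscaled points (assumed model: rho^|xi - xj|),
    so changing `scale` only changes the units of the regressor.
    """
    x = np.asarray(points, dtype=float)
    Sigma = rho ** np.abs(x[:, None] - x[None, :])
    X = np.column_stack([np.ones_like(x), scale * x])
    cov = np.linalg.inv(X.T @ np.linalg.solve(Sigma, X))  # (X' Sigma^{-1} X)^{-1}
    return {"det": np.linalg.det(cov), "trace": np.trace(cov), "sum": cov.sum()}

# Two hypothetical candidate designs with distinct points (keeps Sigma nonsingular).
design_a = [0.0, 1.0, 2.0, 3.0]
design_b = [0.0, 0.5, 2.5, 3.0]

for scale in (1.0, 0.001):
    a, b = criteria(design_a, scale), criteria(design_b, scale)
    for name in ("det", "trace", "sum"):
        print(f"scale={scale:g}  {name:5s} ratio a/b = {a[name] / b[name]:.4f}")
```

The determinant ratio should come out the same for both values of `scale`, because rescaling multiplies both determinants by the same constant. Nothing guarantees this for the trace or the sum of elements, so a comparison based on those can depend on the units chosen.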

If you search this site for "D-optimal" you will find other relevant posts!

kjetil b halvorsen
  • Nice answer. Maybe one thing to add would be the A-optimality criterion, which is the trace of the var-cov matrix, so here we are minimizing the sum of the variances. This goes a bit in the direction of what the OP was asking about. – Wolfgang Sep 02 '14 at 11:28
  • Wolfgang: Yes, but the trace (A-optimality) criterion is still not invariant! But it can be used, with care ... – kjetil b halvorsen Sep 02 '14 at 11:35
  • Right, good point. – Wolfgang Sep 02 '14 at 11:37
  • As far as I can tell, this answer only provides one motivation for D-optimal design: that it is invariant under linear transformations. While this is a **nice** feature, to me it doesn't appear to really motivate why one should use D-optimal; plenty of other metrics are also invariant under linear transformations **and** are tied to real questions of interest, such as minimizing the variance of an estimator of a fixed contrast of interest. I've often wondered why people use D-optimal and haven't been able to come up with a good reason! – Cliff AB Mar 05 '19 at 21:59
  • @Cliff AB: I will try to augment the answer – kjetil b halvorsen Mar 05 '19 at 23:27
  • @kjetilbhalvorsen So is the intuition that, for an experimental design to estimate some parameter $\theta$, we 'look' for data points to sample that will result in low variance in the estimate $\hat{\theta}$ of the parameter? – GENIVI-LEARNER Apr 07 '20 at 21:20