I am trying to find dietary patterns related to a disease outcome. Unfortunately, the only outcome I have is binary ("disease yes/no"). I tried performing PCA on the data, but the resulting dietary patterns are not specific enough to my outcome. Somebody therefore suggested reduced-rank regression, which takes the disease outcome into account when deriving the patterns.

Is it possible to perform reduced rank regression on a binary response variable? If not, are alternatives available?

amoeba
user131483
    My predictors are dietary items (~20 items). I would like to find dietary patterns that are associated with my disease outcome. – user131483 Sep 19 '16 at 07:15

1 Answer

It does not make sense to use reduced-rank regression with a binary dependent variable.

Reduced-rank regression (RRR) is ordinary linear regression with a rank constraint on the coefficient matrix. It only makes sense for multivariate regression, i.e. for regression with multiple response variables. If there are $p$ predictor variables and $q$ response variables, then the matrix of regression coefficients is $p\times q$, and constraining its rank to be smaller than $\min(p,q)$ is a non-trivial restriction. If there is only a single response variable, then the "matrix" of regression coefficients is just a $p\times 1$ vector, which has rank at most $1$, so there is nothing left to constrain. Compare:

$$\mathbf Y = \mathbf X \mathbf B + \epsilon,\quad \mathbf B\in\mathbb R^{p\times q}$$ $$\mathbf y = \mathbf X \boldsymbol \beta + \epsilon,\quad \boldsymbol\beta\in\mathbb R^p$$

If your outcome variable is binary, it can be coded as a sequence of $0$s and $1$s, which makes the response variable one-dimensional. So RRR cannot add anything here: you would simply be running ordinary regression.
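
To make this concrete, here is a minimal sketch on simulated data. It assumes the simplest least-squares form of RRR (identity weighting, no intercept), and the helper `reduced_rank_regression` is a hypothetical name written for this illustration, not a library function. With three responses, a rank-1 constraint changes the coefficients; with a single 0/1 response, it leaves the ordinary least-squares solution untouched.

```python
import numpy as np


def reduced_rank_regression(X, Y, rank):
    """Least-squares RRR with identity weighting (illustrative helper).

    Fits OLS, then projects the coefficients onto the top `rank`
    right singular vectors of the fitted values X @ B_ols.
    Returns (B_rrr, B_ols).
    """
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    _, _, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
    V_r = Vt[:rank].T                      # shape (q, rank)
    return B_ols @ V_r @ V_r.T, B_ols


rng = np.random.default_rng(0)
n, p, q = 200, 20, 3                       # simulated sizes, chosen arbitrarily
X = rng.normal(size=(n, p))

# Several responses: the rank-1 constraint is a real restriction.
Y = X @ rng.normal(size=(p, q)) + rng.normal(size=(n, q))
B_rrr, B_ols = reduced_rank_regression(X, Y, rank=1)
multi_changed = not np.allclose(B_rrr, B_ols)
multi_rank = np.linalg.matrix_rank(B_rrr)

# One binary response coded 0/1: a p x 1 "matrix" already has rank <= 1.
y = (X[:, 0] + rng.normal(size=n) > 0).astype(float).reshape(-1, 1)
b_rrr, b_ols = reduced_rank_regression(X, y, rank=1)
single_unchanged = np.allclose(b_rrr, b_ols)

print("multi-response coefficients changed by the constraint:", multi_changed)
print("rank of the constrained multi-response solution:", multi_rank)
print("single-response coefficients unchanged:", single_unchanged)
```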

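(Which can be fine: for two classes, least-squares regression on a $0$/$1$ response yields a coefficient vector proportional to the linear discriminant analysis (LDA) direction, so the two are equivalent up to scaling and the choice of threshold. You just need to make sure that you are not overfitting, and use regularization if needed.)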
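
The regression/LDA connection can be checked numerically. The sketch below uses simulated two-class data (all sizes and the mean shift are arbitrary assumptions) and compares the OLS slope vector from a fit with an intercept against the LDA direction $\mathbf S_w^{-1}(\boldsymbol\mu_1-\boldsymbol\mu_0)$, where $\mathbf S_w$ is the pooled within-class scatter matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
n0, n1, p = 120, 80, 5                     # arbitrary class sizes and dimension
X0 = rng.normal(size=(n0, p))              # class 0
X1 = rng.normal(size=(n1, p)) + 0.8        # class 1, shifted mean
X = np.vstack([X0, X1])
y = np.concatenate([np.zeros(n0), np.ones(n1)])

# OLS of the 0/1 outcome on the predictors, with an intercept column.
A = np.column_stack([np.ones(len(y)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
beta = coef[1:]                            # drop the intercept

# LDA direction: inverse within-class scatter times the mean difference.
C0 = X0 - X0.mean(axis=0)
C1 = X1 - X1.mean(axis=0)
S_w = C0.T @ C0 + C1.T @ C1
w = np.linalg.solve(S_w, X1.mean(axis=0) - X0.mean(axis=0))

# Cosine of the angle between the two directions.
cosine = beta @ w / (np.linalg.norm(beta) * np.linalg.norm(w))
print("cosine between OLS slopes and LDA direction:", cosine)
```

The two vectors differ only by a positive scale factor, so the cosine equals one up to floating-point error.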


Regarding alternatives, one of the most common approaches in your situation nowadays is logistic regression with elastic net regularization.
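
As a rough sketch of what that could look like in Python with scikit-learn: the data below are simulated (20 hypothetical "dietary items", of which only the first three affect the outcome), and the values of `C` and `l1_ratio` are arbitrary placeholders that would normally be chosen by cross-validation. Argument names may differ slightly between scikit-learn versions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Simulated stand-in for the real data: 20 items, binary disease outcome.
rng = np.random.default_rng(0)
n, p = 500, 20
X = rng.normal(size=(n, p))
true_beta = np.zeros(p)
true_beta[:3] = [1.5, -1.0, 0.8]           # only three items matter (assumption)
prob = 1.0 / (1.0 + np.exp(-(X @ true_beta)))
y = rng.binomial(1, prob)

# Standardize first so the penalty treats all items on the same scale.
model = make_pipeline(
    StandardScaler(),
    LogisticRegression(
        penalty="elasticnet",
        solver="saga",                     # saga supports the elastic net penalty
        l1_ratio=0.5,                      # mix of L1 and L2 (placeholder value)
        C=0.05,                            # inverse penalty strength (placeholder)
        max_iter=5000,
    ),
)
model.fit(X, y)

coefs = model.named_steps["logisticregression"].coef_.ravel()
selected = np.flatnonzero(coefs)           # items with non-zero coefficients
print("items with non-zero coefficients:", selected)
print("their coefficients:", np.round(coefs[selected], 3))
```

The non-zero coefficients can then be read as a single outcome-related "dietary pattern": a weighted combination of items, with the L1 part of the penalty dropping items that contribute little.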

amoeba