
I have analysed my data using multivariate multiple regression (8 IVs, 3 DVs) and have found significant results for the composite (multivariate) tests.

Before reporting my findings, I want to discuss in my results chapter (briefly) how the composite variable is created.

I have done some reading, and in the sources I have found, authors simply state that a 'weighted linear composite' or a 'linear combination of DVs' is created by SPSS (the software I am using).

They do not explain how they are weighted, and as someone relatively new to multivariate statistics, I am still unclear.

Is the composite DV simply a mean score of the three DVs I am using, or does SPSS use a more sophisticated method?

If the latter is true, could anyone either a) explain what this method is, or b) signpost some useful (and accessible) readings which explain the method of creating composite variables?

Many thanks

[Screenshot: SPSS GLM multivariate tests output table]

enoon
  • What procedure (command name) are you using? What SPSS version? – ttnphns Apr 24 '20 at 12:03
  • Hi @ttnphns - I am using GLM, following the syntax found at this link: https://www-01.ibm.com/support/docview.wss?uid=swg21476743, and I am using SPSS 24. Thanks – enoon Apr 24 '20 at 12:13
  • By "composite variable" you probably mean the prediction made by the model, i.e. the linear combination of the IVs. You may look in SPSS Algorithms, GLM, Multivariate, if you need concrete formulas. Note that because the predictand is multivariate, the formulas are likely to be written in matrix notation. – ttnphns Apr 24 '20 at 12:23
  • The SPSS Algorithms document is under the Help menu; it is also downloadable from the internet in PDF format. – ttnphns Apr 24 '20 at 12:27
  • Thank you for the prompt response @ttnphns. Below my original post, I have added a screenshot of the specific results I am trying to interpret. My understanding is that these results indicate the extent to which there is a relationship between each IV and a composite of my DVs. Thus, I am just trying to determine how the DV for these tests was calculated. I will check out the guidance, thanks – enoon Apr 24 '20 at 12:48
  • The output you are showing is the table of the different multivariate tests. This table is analogous to the ANOVA table (showing the significance of the univariate F test) in univariate regression or ANOVA. – ttnphns Apr 24 '20 at 12:52
  • These tests are described briefly in this answer of mine: https://stats.stackexchange.com/a/255444/3277. (However, the formulas given there are for the case of one categorical predictor.) – ttnphns Apr 24 '20 at 12:56

1 Answer


Multivariate tests in GLM are created using the multivariate general linear hypothesis form $L\beta M = K$, where $L$ is a matrix of contrast coefficients applied to the predictors, $\beta$ is the matrix of population regression coefficients, $M$ is a transformation matrix of the dependent variables, and $K$ is the matrix of null-hypothesized values for $L\beta M$, typically (and by default) $0$. As @ttnphns indicates, these are matrices rather than scalars because the model is multivariate.

In the case of these particular tests, $L$ is actually a row vector $l$ with 0s for the intercept and for all but one of the predictors, and a 1 for the predictor being tested. The estimated matrix of regression coefficients, $\hat{\beta}$, is used for $\beta$, and $M$ is an identity matrix of order equal to the number of dependent variables, in your case three.
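To make that concrete (this just writes out the structure described above; it is not taken from your output), the test of, say, the first of your eight predictors uses

$$
L = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}, \qquad
M = I_3, \qquad
K = \begin{pmatrix} 0 & 0 & 0 \end{pmatrix},
$$

so $L\hat{\beta}M$ is simply the $1 \times 3$ row of estimated coefficients for that predictor across the three dependent variables.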

The regression of each dependent variable on the set of predictors is the same here as if you were to run each one separately; the same regression coefficients and univariate test statistics will result either way. The null hypotheses tested in the multivariate case are simply the intersections of the univariate null hypotheses: the population coefficient for a given predictor is 0 for every one of the dependent variables.
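A quick way to see the first point is to fit the same data both ways. This is a minimal sketch with simulated data (NumPy rather than SPSS, with arbitrary sample size), not anything specific to your analysis:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, q = 200, 8, 3                                          # cases, IVs, DVs
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])   # design matrix with intercept
Y = rng.normal(size=(n, q))                                   # three dependent variables

# Multivariate OLS: all three DVs fitted at once
B_multi, *_ = np.linalg.lstsq(X, Y, rcond=None)               # shape (p + 1, q)

# Separate univariate OLS fits, one DV at a time
B_separate = np.column_stack(
    [np.linalg.lstsq(X, Y[:, j], rcond=None)[0] for j in range(q)]
)

print(np.allclose(B_multi, B_separate))                       # True: identical coefficients
```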

What's done in the multivariate case is simply to pool information about the correlations among the dependent variables to create more powerful tests of the same hypotheses you'd test about each dependent variable separately.
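For reference, here is a minimal NumPy sketch of how the hypothesis and error SSCP matrices and the four statistics in the multivariate tests table can be computed for one predictor. It uses simulated data and the standard textbook formulas; the SPSS Algorithms document mentioned in the comments is the authoritative description of exactly what SPSS computes:

```python
import numpy as np

def multivariate_tests(X, Y, L, M=None):
    """Multivariate test statistics for the general linear hypothesis L*B*M = 0.

    X: n x (p+1) design matrix (including the intercept column)
    Y: n x q matrix of dependent variables
    L: contrast matrix applied to the rows of the coefficient matrix B
    M: transformation of the DVs (identity matrix by default)
    """
    q = Y.shape[1]
    if M is None:
        M = np.eye(q)
    XtX_inv = np.linalg.inv(X.T @ X)
    B = XtX_inv @ X.T @ Y                                   # estimated coefficient matrix
    resid = Y - X @ B
    LBM = L @ B @ M                                          # estimate of L*beta*M
    H = LBM.T @ np.linalg.inv(L @ XtX_inv @ L.T) @ LBM       # hypothesis SSCP matrix
    E = M.T @ (resid.T @ resid) @ M                          # error SSCP matrix
    eigs = np.linalg.eigvals(np.linalg.solve(E, H)).real     # eigenvalues of E^{-1} H
    return {
        "Wilks' lambda": np.prod(1.0 / (1.0 + eigs)),        # det(E) / det(H + E)
        "Pillai's trace": np.sum(eigs / (1.0 + eigs)),
        "Hotelling-Lawley trace": np.sum(eigs),
        # Reported here as the largest eigenvalue itself; some software reports lambda/(1+lambda)
        "Roy's largest root": eigs.max(),
    }

# Illustrative call: test of the first predictor, 8 IVs and 3 DVs, simulated data
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 8))])
Y = rng.normal(size=(n, 3))
L = np.zeros((1, 9))
L[0, 1] = 1.0                                                # 1 for the tested predictor, 0 elsewhere
print(multivariate_tests(X, Y, L))
```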

David Nichols