Does the 'Table 2 fallacy' render results from penalised regression with many variables uninterpretable?

Question

A reviewer recently took to my paper with a baseball bat, suggesting that the 'Table 2 Fallacy' rendered my results essentially uninterpretable.

In the paper I used penalised regression (lasso) to assess what factors were important predictors of meeting criteria for cannabis use disorder. We had seventeen predictors and ~900 observations. The study was cross-sectional.

This was the first I had heard of the Table 2 Fallacy, so I read about it online here and at CV here. The gist of the argument seems to be that (a) you need some sort of structural equation modelling approach to properly make sense of the relationships between predictors and outcome variables, and (b) penalised regression and statistical learning techniques are only of use for prediction and not so much for explanation. This was a bit dispiriting to me because I understood that penalised regression was a more ethical way to identify predictors of a particular outcome in the absence of a unifying explanatory model. It also seems to me that structural equation modelling is only useful with a manageable number of covariates, certainly far fewer than 17. Yet the 17 we identified and added to our model have all been found to be associated with cannabis use disorder throughout the literature.

So my questions are:

1. Is the reviewer right? Does the absence of structural equation modelling render my results meaningless?

2. Under what circumstances would penalised regression with a large number of predictors have validity as a tool for explanation (i.e. rather than merely prediction)? In other words, how do the myriad of papers using penalised regression get around this problem? Are there better analyses we could conduct?

There is an important difference between finding *predictors* of the outcome and finding a causal relationship between them. The latter would require something like SEM. Perhaps your wording was too suggestive of causal relationships, rather than just associations? If you could let us in on what part of the paper *specifically* the reviewer disagreed with, it would be easier to answer. — Frans Rodenburg, May 24 '21 at 07:31
Thank you for replying @Frans Rodenburg, you actually set my mind at ease somewhat. The reviewer made several comments suggesting that we tidy up our language around the X-variables and not use the word 'predictors'. They suggested using 'correlates' instead, which we did in several places, but I think a couple of 'predictors' slipped through. So I think your hunch was correct: the reviewer was probably saying that we couldn't make causal inferences with the penalised regression, rather than saying that it was totally invalid. — llewmills, May 25 '21 at 02:55
The comment itself is actually quite vague (and long). Here it is: *"7. ‘Table 2 Fallacy’ The authors are to be congratulated for putting a harness on a recently introduced horseshoe regression modeling approach, but the variance inflation factor approach does not address what has come to be known as a ‘Table 2’ fallacy in research of this type. For example, let’s consider being binary male versus binary female as a predictor of a CUD experience..."* — llewmills, May 25 '21 at 02:57
continued. *"And let’s consider a possibility of male-female variations in the reinforcing functions served by cannabinoid self-administration, followed by a greater number of days of cannabis use for male users versus female users, followed by a more rapid time to becoming tolerant to the cannabinoid effects. Given this DAGs specification, would we want to make a covariate adjustment for ‘days of cannabis use’ when we are studying the male-female differences whether tolerance is observed sooner rather..."* — llewmills, May 25 '21 at 02:58
continued. *"...than later after onset of cannabis use? The answer would seem to be that it depends upon what you are trying to do. The most important ‘predictor’ of this response might be being male or being female, and this would be seen in a Table 1 (without ancillary covariates for each X-variable), but the importance of that fundamental relationship would not be disclosed if the Table 2 estimate of the male-female association is covariate-adjusted for ‘days of cannabis use.’"* — llewmills, May 25 '21 at 02:58
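
To make the reviewer's example concrete, here is a minimal simulation sketch of the chain they describe (sex, then days of use, then tolerance). Everything in it is an assumption for illustration: the effect sizes are invented, tolerance is treated as a continuous score rather than a time-to-event, sex is assumed to act only through days of use, and the function name `simulate_and_fit` is hypothetical. None of it comes from the paper's data. It only shows why a mutually adjusted coefficient can hide an association that an unadjusted one reveals.

```python
import numpy as np


def simulate_and_fit(n=5000, seed=0):
    """Simulate male -> days of use -> tolerance and fit two OLS models.

    All coefficients below are made-up illustrative values.
    Returns the coefficient on `male` without and with adjustment for days.
    """
    rng = np.random.default_rng(seed)

    # Binary sex indicator (1 = male), assumed exogenous.
    male = rng.integers(0, 2, size=n).astype(float)
    # Mediator: males use on more days (assumed effect of +5 days).
    days = 10.0 + 5.0 * male + rng.normal(0.0, 2.0, size=n)
    # Outcome depends on days only; no direct path from sex (assumption).
    tolerance = 0.4 * days + rng.normal(0.0, 1.0, size=n)

    ones = np.ones(n)

    # Model 1: tolerance ~ male (estimates the total association).
    X_unadj = np.column_stack([ones, male])
    beta_unadj, *_ = np.linalg.lstsq(X_unadj, tolerance, rcond=None)

    # Model 2: tolerance ~ male + days (adjusts for the mediator).
    X_adj = np.column_stack([ones, male, days])
    beta_adj, *_ = np.linalg.lstsq(X_adj, tolerance, rcond=None)

    return beta_unadj[1], beta_adj[1]


total_effect, adjusted_coef = simulate_and_fit()
print(f"male coefficient, unadjusted:        {total_effect:.2f}")
print(f"male coefficient, adjusted for days: {adjusted_coef:.2f}")
```

Under these assumptions the unadjusted coefficient should land near 5 × 0.4 = 2, while the adjusted one should sit near zero, even though sex is the upstream cause of the whole chain. A lasso or horseshoe fit with all covariates entered together reports coefficients of the second kind, which is presumably the reviewer's point about reading each one as that variable's importance.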
