Rather than comparing different models on a metric that is merely scaled similarly (e.g., estimating two separate models and comparing Nagelkerke $R^2$ values), I would advise choosing a modeling framework in which both prediction equations can be estimated simultaneously and rescaled directly, so that the relative impact of the predictor on each outcome is genuinely comparable.
Consider a pair of models quite similar to the variables discussed in the original post:
```
lm(mpg ~ cyl, data = datasets::mtcars)
glm(vs ~ cyl, data = datasets::mtcars, family = binomial(link = "probit"))
```
Using the {pscl} package's `pR2()` function, the $R^2$ metrics are fairly inconclusive across the three pseudo-$R^2$s reported (i.e., "McFadden", "r2ML", and "r2CU"). Unfortunately, `mpg` is scaled very differently than `vs`, which is clear from the very different underlying log-likelihoods reported below.
```
> lm(mpg ~ cyl, data = mtcars) |> pscl::pR2()
fitting null model for pseudo-r2
         llh      llhNull           G2     McFadden         r2ML         r2CU 
 -81.6532086 -102.3777581   41.4490989    0.2024321    0.7261800    0.7273903 

> glm(vs ~ cyl, data = mtcars, family = binomial(link = "probit")) |> pscl::pR2()
fitting null model for pseudo-r2
        llh     llhNull          G2    McFadden        r2ML        r2CU 
 -8.8896779 -21.9300546  26.0807534   0.5946349   0.5573711   0.7470935 
```
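As a quick sanity check on where the "McFadden" column comes from (it is just $1 - \ell_{\text{model}} / \ell_{\text{null}}$ applied to the log-likelihoods above), the value for the linear model can be reproduced by hand:

```
# McFadden's pseudo-R^2 by hand: 1 - llh/llhNull, using the same
# log-likelihoods that pR2() reports above for the linear model
fit  <- lm(mpg ~ cyl, data = mtcars)
null <- lm(mpg ~ 1, data = mtcars)
1 - as.numeric(logLik(fit)) / as.numeric(logLik(null))
#> [1] 0.2024321
```

This makes the scaling problem concrete: the metric depends entirely on the outcome's likelihood, and a Gaussian likelihood for `mpg` and a Bernoulli likelihood for `vs` are not on a common footing.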
If, however, both equations are estimated within a single model and a standardized metric of their effects is computed, the results, I would argue, are far more comparable.
For instance, {lavaan} (https://lavaan.ugent.be/) can accommodate both linear and probit models and offers methods that standardize the results to aid comparability. Specifically, {lavaan} estimates covariance structure/structural equation models, which allows it to produce a standardized solution by rescaling the estimates by the estimated observed or latent variances of the outcomes, assuming both follow an underlying Normal distribution.
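Loosely, the standardized slope rescales the raw coefficient by the ratio of the model-implied standard deviations of the predictor and the outcome, where for the probit equation the outcome is the latent response $vs^*$ underlying the observed binary `vs` (my sketch of the usual convention; see the {lavaan} documentation for the exact estimator):

$$\hat\beta_{\text{std}} = \hat\beta \times \frac{\widehat{\operatorname{SD}}(\texttt{cyl})}{\widehat{\operatorname{SD}}(y)}, \qquad \widehat{\operatorname{Var}}(vs^*) = \hat\beta^2\,\widehat{\operatorname{Var}}(\texttt{cyl}) + \widehat{\operatorname{Var}}(\varepsilon),$$

where $y$ is `mpg` in the first equation and the latent $vs^*$ in the second.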
```
> lavaan("mpg ~ cyl; vs ~ cyl", data = mtcars, ordered = "vs") |> standardizedsolution()
   lhs  op rhs est.std    se        z pvalue ci.lower ci.upper
1  mpg   ~ cyl  -0.982 0.004 -239.905  0.000   -0.990   -0.974
2   vs   ~ cyl  -0.846 0.089   -9.460  0.000   -1.022   -0.671
3   vs   |  t1  -2.743 0.733   -3.744  0.000   -4.178   -1.307
4  mpg  ~~ mpg   0.037 0.008    4.547  0.000    0.021    0.052
5   vs  ~~  vs   0.284 0.151    1.872  0.061   -0.013    0.580
6  cyl  ~~ cyl   1.000 0.000       NA     NA    1.000    1.000
7   vs ~*~  vs   1.000 0.000       NA     NA    1.000    1.000
8  mpg  ~1       0.000 0.000       NA     NA    0.000    0.000
9   vs  ~1       0.000 0.000       NA     NA    0.000    0.000
10 cyl  ~1       3.465 0.000       NA     NA    3.465    3.465
```
In particular, rows '1' and '2' provide a standardized metric for the effect of `cyl` on both `mpg` and `vs` that can be interpreted in the same way: the standard deviation change in the outcome (in the case of `vs`, on the underlying latent variable) given a standard deviation change in `cyl`.
Here, it is clear that `cyl` does slightly better predicting `mpg` than `vs` in terms of the magnitude of the coefficient (-0.982 vs. -0.846) when the estimates are standardized by their estimated variances.
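If you only want the two slopes side by side, `standardizedsolution()` returns a data frame, so the regression rows can be filtered directly (a small convenience sketch reusing the model call from above):

```
# Keep only the regression rows (op == "~") of the standardized solution
fit <- lavaan::lavaan("mpg ~ cyl; vs ~ cyl", data = mtcars, ordered = "vs")
std <- lavaan::standardizedsolution(fit)
subset(std, op == "~", select = c(lhs, op, rhs, est.std, ci.lower, ci.upper))
```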
I anticipate a similar approach could be adopted for your situation to more directly compare these two predictive models/equations.
Appendix
Note that {lavaan}'s unstandardized results match those obtained from the `lm` and `glm` models.
```
> lavaan("mpg ~ cyl; vs ~ cyl", data = mtcars, ordered = "vs") |> coef()
mpg~cyl  vs~cyl 
 -2.876  -0.890 

> lm(mpg ~ cyl, data = mtcars)

Call:
lm(formula = mpg ~ cyl, data = mtcars)

Coefficients:
(Intercept)          cyl  
     37.885       -2.876  

> glm(vs ~ cyl, data = mtcars, family = binomial(link = "probit"))

Call:  glm(formula = vs ~ cyl, family = binomial(link = "probit"), data = mtcars)

Coefficients:
(Intercept)          cyl  
       5.15        -0.89  

Degrees of Freedom: 31 Total (i.e. Null);  30 Residual
Null Deviance:      43.86 
Residual Deviance: 17.78    AIC: 21.78
```