For concreteness:
library( mgcv )
set.seed( 1 )
RawData <- data.frame( y = rbinom( 1000, 1, 0.5 ), x1 = rnorm( 1000 ),
                       x2 = as.factor( rbinom( 1000, 1, 0.5 ) ), x3 = rnorm( 1000 ),
                       x4 = as.factor( rbinom( 1000, 1, 0.5 ) ) )
fit <- gam( y ~ s( x1 ) + x2 + s( x3, by = x2 ) + x4, data = RawData,
            family = nb( link = log ) )
How can I measure the importance of these four variables?
I understand that "variable importance" is not a well-defined concept, so I am looking for the most straightforward way, such as an explained variance approach.
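To make "explained variance approach" concrete, here is a minimal sketch of the kind of thing I have in mind: refit the model with each variable left out and record the drop in deviance explained. The reduced formulas and the helper `dev_expl` are my own illustration, not an established procedure. Smoothing parameters and the negative binomial theta are re-estimated in every refit, and leaving out x2 also turns s( x3, by = x2 ) into a plain s( x3 ), so the differences are probably only roughly comparable.

```r
library( mgcv )
set.seed( 1 )
RawData <- data.frame( y = rbinom( 1000, 1, 0.5 ), x1 = rnorm( 1000 ),
                       x2 = as.factor( rbinom( 1000, 1, 0.5 ) ), x3 = rnorm( 1000 ),
                       x4 = as.factor( rbinom( 1000, 1, 0.5 ) ) )

# Full model formula, as in the question
full <- y ~ s( x1 ) + x2 + s( x3, by = x2 ) + x4

# One reduced formula per variable (hypothetical choice: each variable is
# removed from every term it appears in)
reduced <- list(
  x1 = y ~ x2 + s( x3, by = x2 ) + x4,
  x2 = y ~ s( x1 ) + s( x3 ) + x4,
  x3 = y ~ s( x1 ) + x2 + x4,
  x4 = y ~ s( x1 ) + x2 + s( x3, by = x2 )
)

# Helper (my own name): proportion of deviance explained by a fitted formula
dev_expl <- function( f ) {
  summary( gam( f, data = RawData, family = nb( link = log ) ) )$dev.expl
}

full_de <- dev_expl( full )

# Drop in deviance explained when each variable is left out
drop_de <- sapply( reduced, function( f ) full_de - dev_expl( f ) )
print( round( drop_de, 4 ) )
```

Since the data here are pure noise, all four numbers should be close to zero (and may even be slightly negative); the question is whether this kind of comparison is sound on real data.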
The ANOVA table seems like a natural choice. However, as explained in this answer, it does not work here: for the smooth terms of a GAM, its entries do not have an explained-variance interpretation.
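For reference, this is a short sketch of how the table in question can be obtained for the model above, using `anova()` and `summary()` from mgcv. As far as I can tell, it reports parametric terms and smooth terms separately (effective degrees of freedom, test statistics and p-values), plus a single overall deviance explained, but no per-term share of explained variance.

```r
library( mgcv )
set.seed( 1 )
RawData <- data.frame( y = rbinom( 1000, 1, 0.5 ), x1 = rnorm( 1000 ),
                       x2 = as.factor( rbinom( 1000, 1, 0.5 ) ), x3 = rnorm( 1000 ),
                       x4 = as.factor( rbinom( 1000, 1, 0.5 ) ) )
fit <- gam( y ~ s( x1 ) + x2 + s( x3, by = x2 ) + x4, data = RawData,
            family = nb( link = log ) )

# Wald-type tests for parametric and smooth terms
tab <- anova( fit )
print( tab )

# The smooth-term part of the table: edf, reference df, statistic, p-value
print( tab$s.table )

# Only an overall figure is available: proportion of deviance explained
overall_de <- summary( fit )$dev.expl
print( overall_de )
```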
What would be a sound approach, then?