
I am trying to build a multiple regression model while partitioning my data into subgroups based on an additional set of covariates. I implemented lmtree() (or mob()) from the "partykit" package, and I am trying to understand its post-pruning strategies based on the AIC and BIC criteria, but I need some help!

Inside lmtree(), we can see the following pruning functions:

    "aic" = {
  function(objfun, df, nobs) (nobs[1L] * log(objfun[1L]) + 2 * df[1L]) < (nobs[1L] * log(objfun[2L]) + 2 * df[2L])
}, "bic" = {
  function(objfun, df, nobs) (nobs[1L] * log(objfun[1L]) + log(nobs[2L]) * df[1L]) < (nobs[1L] * log(objfun[2L]) + log(nobs[2L]) * df[2L])
}, "none" = {
  NULL

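These functions decide, for each split, whether the mother node's penalized objective beats the combined objective of its daughter nodes. Here is a standalone sketch with made-up numbers (`aic_prune` is just my own name for the anonymous "aic" function above):

```r
# The "aic" comparison as a standalone function. Index 1 refers to the
# mother node, index 2 to the combined daughter nodes.
aic_prune <- function(objfun, df, nobs) {
  (nobs[1L] * log(objfun[1L]) + 2 * df[1L]) <
    (nobs[1L] * log(objfun[2L]) + 2 * df[2L])
}

# Illustrative (made-up) numbers: mother has RSS 120 on 50 observations
# with 2 parameters; the daughters' summed RSS is 80 with 4 parameters.
aic_prune(objfun = c(120, 80), df = c(2, 4), nobs = c(50, 50))
## [1] FALSE
```

A return value of TRUE would mean the mother's penalized criterion is smaller, i.e., the split does not pay for itself and is pruned; here the daughters win, so the split would be kept.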
To understand how these functions prune child nodes, I first grew a very large tree with control = mob_control(verbose = TRUE, ordinal = "L2", alpha = 0.5) and saved the AIC, nobs, logLik, and df values of each node (so that I could recompute the AIC criterion above by hand):

[screenshot: table of AIC, nobs, logLik, and df values for each node]

Then I fit another lmtree() with mob_control(verbose = TRUE, ordinal = "L2", alpha = 0.5, prune = "AIC") to see which child nodes were cut. This yields a smaller tree without nodes 4, 5, 8, 9, 10, 11, 14, 15, 19, 20, 24, 25, 26, 27 from the first, large tree.

I tried to compute the AIC criterion from the table above, e.g., starting with nodes 19 and 20 compared against node 18. However, as I kept pruning the tree from the bottom up, my hand calculations did not always match what lmtree() does... Can you clearly explain what objfun[2], nobs[2], and df[2] are in the AIC and BIC functions? For example, after nodes 10 and 11 are cut, how do I decide whether to keep nodes 8 and 9 compared with node 7?

Thank you so much for your time in advance!

sunmee

1 Answer


Disclaimer: I can't use your example because it is not reproducible and it isn't clear to me how exactly you have set up the table with the log-likelihoods. It seems that the log-likelihoods are all evaluated at the full parameter values (of the large tree) and not the restricted parameter values in the inner nodes. But, again, I cannot verify this with the information provided.

For a reproducible example, consider the following simple analysis of the cars data:

library("partykit")
m <- lmtree(dist ~ 1 | speed, data = cars, alpha = 0.5, prune = "AIC")
plot(m)

And we extract the models in all nodes of the tree:

ms <- refit.modelparty(m)

Now let's check why the split of node 2 into nodes 3 and 4 is kept. The AIC of the model in node 2 is:

AIC(ms[["2"]])
## [1] 265.2902
-2 * as.numeric(logLik(ms[["2"]])) + 2 * 2
## [1] 265.2902

The AIC of the combined nodes 3 and 4 is:

-2 * as.numeric(logLik(ms[["3"]]) + logLik(ms[["4"]])) + 2 * (2 + 1 + 1)
## [1] 247.7727

Thus, the split improves the model and is not pruned. Note that the parameters of the 3/4 models comprise two separate means, a single error variance, and one additional estimated breakpoint. One could compute this differently, e.g., with two variances or with a different penalty for the additional breakpoints, etc. The partykit package offers a couple of variants for this.
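The same comparison can be written in the objfun/df/nobs form of the pruning functions quoted in the question. For lm nodes the objective function is the residual sum of squares, and n * log(RSS) + 2 * df orders the models in the same way as the AIC values above, since the remaining AIC terms are constants that cancel between mother and daughters. A sketch (the helper `rss` and the parameter counts are my own, following the counting in the text above):

```r
library("partykit")

# Refit the example tree and extract the node models.
m  <- lmtree(dist ~ 1 | speed, data = cars, alpha = 0.5, prune = "AIC")
ms <- refit.modelparty(m)

rss <- function(mod) sum(residuals(mod)^2)  # objective function for lm nodes

objfun <- c(rss(ms[["2"]]), rss(ms[["3"]]) + rss(ms[["4"]]))    # mother, daughters
n      <- nobs(ms[["2"]])                                       # same in both terms
df     <- c(2, 2 + 1 + 1)  # two means, shared variance, split point

# TRUE would mean the mother wins and the split is pruned.
(n * log(objfun[1L]) + 2 * df[1L]) < (n * log(objfun[2L]) + 2 * df[2L])
## [1] FALSE
```

The comparison comes out FALSE, matching the AIC comparison above: the split is kept.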

Achim Zeileis
  • Thank you so much for your very clear answer! I also saw the equation that you showed here in https://cran.r-project.org/web/packages/partykit/vignettes/mob.pdf, p. 12, but I was wondering about the terms "nobs[1L]" and "nobs[2L]" in lmtree() in the partykit package. – sunmee Nov 14 '17 at 15:32
  • For each split these quantities are computed as follows: `objfun[1]`, `nobs[1]` and `df[1]` are simply the objective function, sample size, and number of estimated parameters in the mother node. `objfun[2]` is the _sum_ of the objective functions _across_ daughter nodes. Similarly, `df[2]` is the _sum_ of number of estimated parameters (optionally plus a penalty for the split itself). And `nobs[2]` is the _sum_ of sample sizes (which, of course, is just the sample size of the mother node). – Achim Zeileis Nov 16 '17 at 11:44
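To connect this back to the original question, here is a sketch with entirely made-up numbers (the node labels and values are hypothetical). Once nodes 10 and 11 have been pruned, node 7 is compared against its daughters 8 and 9 using exactly these sums, where node 8 now contributes its own single-model objective function rather than the sum over its former children:

```r
# Hypothetical per-node values: RSS (objfun), sample size, parameter count.
objfun <- c(200, 90 + 85)  # [1] mother (node 7), [2] sum over daughters (8 + 9)
nobs   <- c(60, 30 + 30)   # nobs[2] is again just the mother's sample size
df     <- c(2, 2 + 1 + 1)  # one possible counting: two means, a shared
                           # variance, and the split point (as in the answer)

prune_split <- (nobs[1L] * log(objfun[1L]) + 2 * df[1L]) <
               (nobs[1L] * log(objfun[2L]) + 2 * df[2L])
prune_split  # TRUE would collapse nodes 8 and 9 back into node 7
## [1] FALSE
```

The key point for a bottom-up pass is that after a node's children are pruned, its objective function must be re-evaluated as that of the single restricted model fitted in the node, not taken from the table of the original large tree, which is exactly the issue raised in the disclaimer above.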