I am implementing a Random Forest model for predicting a variable "A" that is a function of 4 other variables: $$A = f(B,C,D,E)$$ I developed a good RF model (i.e. high accuracy, good generalisation and extrapolation capabilities...), but I want to compare its accuracy with that of a linear regression model: $$A = 2B + 3C + 4D + 5E + a_0$$ I have used the Nash-Sutcliffe efficiency coefficient (NSE) and the Mean Absolute Percentage Error (MAPE) to compare the accuracy of the RF and linear regression models. However, I would also like to use the Akaike Information Criterion (AIC). Is it possible to use AIC for this purpose? Thanks in advance.
1 Answer
AIC is defined as
$$ \text{AIC} = 2k - 2\ln(\mathcal{L}) $$
where $k$ is the number of parameters and $\ln(\mathcal{L})$ is the log-likelihood. The first problem is that a random forest is not fitted by maximum likelihood, and there is no obvious likelihood function for it. The second problem is the number of parameters $k$: for linear regression this is simply the number of $\beta$ parameters, but what would it be for a random forest? The number of trees, their depth, the number of splits, all of the above?

The role of $k$ is to penalize model complexity. For a linear regression with $m$ features, no interaction terms, and an intercept, $k = m + 1$; whatever you counted for a random forest would yield a much higher penalty, and it is not clear that this would be fair. If you penalize random forests for complexity relative to linear regression, they will always be many orders of magnitude more complex, so the comparison is not meaningful here. The same applies to many other machine learning models: it is often not obvious how we would measure their complexity, hence it is hard to come up with a penalty for it.
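For the linear model alone, the formula is straightforward to apply, since OLS with Gaussian errors has a closed-form log-likelihood. A minimal sketch on synthetic data (the coefficients and noise level here are illustrative, not the asker's real model; $k$ counts the five regression coefficients plus the estimated error variance):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: A as a noisy linear function of B, C, D, E
# (illustrative coefficients, not the asker's actual model).
n = 200
X = rng.normal(size=(n, 4))  # columns: B, C, D, E
A = X @ np.array([2.0, 3.0, 4.0, 5.0]) + 1.0 + rng.normal(scale=0.5, size=n)

# Fit OLS with an intercept via least squares.
X1 = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(X1, A, rcond=None)
resid = A - X1 @ beta

# Gaussian log-likelihood evaluated at the MLE of the error variance:
# ln L = -n/2 * (ln(2*pi*sigma^2) + 1)
sigma2 = np.mean(resid**2)
loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)

# k = 5 coefficients (intercept + 4 slopes) + 1 for the error variance.
k = X1.shape[1] + 1
aic = 2 * k - 2 * loglik
print(f"AIC = {aic:.2f}")
```

For the random forest there is no analogous `loglik` or `k` to plug in, which is exactly the problem described above.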

- Thank you for your valuable reply! I fully agree with your answer. There is no way to compare the two models using AIC, since the parameters are not comparable. Thanks a lot! Regards. – cdmon Sep 08 '20 at 15:19
- Also note that the AIC is typically used to assess the in-sample model fit (hence the need to correct for the degrees of freedom in the model). You would probably do best to assess the linear model out-of-sample, e.g. through cross-validation, just like you assess your Random Forest. If you use MAPE for the linear model's in-sample predictions, they will be slightly overfit because you used the same data for estimation as for evaluation. Evaluating both models out-of-sample is a solution to the problem of having to estimate the complexity of a ML model, which is hard as mentioned above. – Mark Verhagen Jan 07 '22 at 17:14
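The out-of-sample comparison suggested in the comment above can be sketched with scikit-learn, using the same MAPE metric for both models under identical cross-validation folds (synthetic data again; the true relationship and hyperparameters are assumptions for illustration):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(0)

# Synthetic stand-in for the asker's data: A = f(B, C, D, E) + noise.
n = 300
X = rng.normal(size=(n, 4))
A = X @ np.array([2.0, 3.0, 4.0, 5.0]) + 1.0 + rng.normal(scale=0.5, size=n)

# Identical folds for both models, so the scores are directly comparable.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

results = {}
for name, model in [
    ("linear regression", LinearRegression()),
    ("random forest", RandomForestRegressor(n_estimators=200, random_state=0)),
]:
    # The scorer returns negated MAPE, so flip the sign for reporting.
    mape = -cross_val_score(
        model, X, A, cv=cv, scoring="neg_mean_absolute_percentage_error"
    )
    results[name] = mape.mean()
    print(f"{name}: mean CV MAPE = {results[name]:.3f}")
```

Because both models are scored on held-out folds, neither needs a complexity penalty: overfitting shows up directly in the cross-validated error.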