Statistical conclusions based on conditional trees

Question

I have a complex dataset in which the number of features is much larger than the number of samples. The question is: which features are important for classifying the samples into 2 groups?

I think that ctree (after some feature engineering that takes possible interactions into account) is a good instrument for this. However, I need to present the results in a paper.

Do I need to cross-validate ctree in order to be able to present some kind of "significance", e.g. "feature X appears 10 times out of 12 as the root split, so it may be important"? I would go with random forest feature importance (and shuffle the labels to obtain p-values), but as far as I know RF is parametric and ctree is non-parametric, which is preferable...
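To make the idea concrete, here is a minimal sketch of what I mean by counting root splits and shuffling the labels. It is only an illustration under several assumptions: ctree is an R function, so this pure-Python version swaps in a plain single-split Gini stump rather than ctree's conditional inference test; it uses bootstrap resamples instead of cross-validation folds; the function names are hypothetical; and the toy data contain one deliberately (unrealistically) strong feature.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_root_feature(X, y):
    """Index of the feature whose best threshold split has the lowest weighted Gini."""
    n = len(y)
    best_feature, best_score = None, float("inf")
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0  # candidate threshold between neighbouring values
            left = [y[i] for i in range(n) if X[i][j] <= t]
            right = [y[i] for i in range(n) if X[i][j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best_feature, best_score = j, score
    return best_feature


def root_split_counts(X, y, n_resamples, rng):
    """Count how often each feature is the root split across bootstrap resamples."""
    counts = Counter()
    n = len(y)
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        feature = best_root_feature([X[i] for i in idx], [y[i] for i in idx])
        if feature is not None:
            counts[feature] += 1
    return counts


def permutation_p_value(X, y, n_resamples, n_permutations, rng):
    """P-value for the largest root-split count, with shuffled labels as the null."""
    observed = root_split_counts(X, y, n_resamples, rng)
    observed_max = max(observed.values(), default=0)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        y_shuffled = y[:]
        rng.shuffle(y_shuffled)  # breaks any real link between features and labels
        null_counts = root_split_counts(X, y_shuffled, n_resamples, rng)
        if max(null_counts.values(), default=0) >= observed_max:
            at_least_as_extreme += 1
    # The +1 terms keep the estimate away from an impossible p-value of zero.
    return observed, (at_least_as_extreme + 1) / (n_permutations + 1)


# Toy data (an assumption for illustration): 16 samples, 24 features.
rng = random.Random(0)
n_samples, n_features = 16, 24
y = [0] * 8 + [1] * 8
X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
for i in range(n_samples):
    X[i][0] += 10.0 * y[i]  # feature 0 is made strongly informative on purpose

counts, p_value = permutation_p_value(X, y, n_resamples=12, n_permutations=40, rng=rng)
print(counts.most_common(3), p_value)
```

The test statistic here is the largest root-split count over all features, not the count for one feature chosen after looking at the data, so that the selection of "feature X" is itself part of the null distribution.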

To my knowledge random forest is non-parametric, why would you assume otherwise? — Scholar, Feb 01 '19 at 10:40
But aren't the decision trees that a random forest builds based on parametric assumptions? I am sure that regression trees are: each split is performed according to the distribution of the residuals. I also know that, theoretically, a random forest can be built from any type of tree, ctree included, but I do not know where that has been implemented... — German Demidov, Feb 01 '19 at 11:25
Perhaps you should step back and look at the bigger picture: parametric vs non-parametric statistics: https://projecteuclid.org/download/pdf_1/euclid.ss/1009213726 — Peter Teoh, Feb 02 '19 at 03:29
https://stats.stackexchange.com/questions/147587/are-random-forest-and-boosting-parametric-or-non-parametric — Peter Teoh, Feb 02 '19 at 03:30
@bi_scholar I was sure that the split at each point is performed according to some metric such as RSS in the case of a continuous output. I was wrong, sorry — German Demidov, Feb 06 '19 at 08:15
@GermanDemidov that is indeed the case, but that doesn't make the algorithm parametric, as it does not imply that any particular distribution of the data-generating process is assumed. In theory, random forests can model any distribution, while (parametric) models such as LDA cannot. — Scholar, Feb 06 '19 at 10:22
@bi_scholar yep, agreed, so the crucial mistake in my question was writing "non-parametric" instead of "robust to outliers" =( my fault. But thanks to you and Peter Teoh I understand the definitions much better now... — German Demidov, Feb 06 '19 at 11:43

