I'm making a random forest classifier.
In every tutorial, there is a very simple example of how to calculate entropy with Boolean attributes.
In my problem, the attribute values are computed by the tf-idf scheme, so they are real numbers.
Is there some clever way of applying an information gain function so that it calculates IG with real-number weights, or should I use a discretization such as the following (see the sketch after the list):
0 = 0
(0, 0.1] = 1
(0.1, 0.2] = 2
etc.
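For illustration, here is a minimal sketch of that binning, assuming NumPy, tf-idf weights in [0, 1], and a hypothetical bin width of 0.1 (the example values are made up):

```python
import numpy as np

# Binning scheme from above: 0 -> 0, (0, 0.1] -> 1, (0.1, 0.2] -> 2, ...
tfidf_weights = np.array([0.0, 0.03, 0.15, 0.27, 0.85])  # example values
bin_edges = np.linspace(0.0, 1.0, 11)                    # 0.0, 0.1, ..., 1.0

# right=True assigns each value to the half-open interval (edge_{i-1}, edge_i],
# so an exact 0.0 stays in bin 0 and positive weights land in bins 1..10.
discrete = np.digitize(tfidf_weights, bin_edges, right=True)
print(discrete)  # [0 1 2 3 9]
```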
EDIT
I have the function:
$$ IG(A) = E(C) - E(C,A) $$
$$ E(C) = \sum\limits_{i=1}^{|C|} -P(c_i)\log\bigl(P(c_i)\bigr) $$
and
$$
E(C,A) = \sum\limits_{a\in A} P(a)\, E(a)
$$
where $E(a)$ is the class entropy among the examples with $A = a$.
The problem is that I have an infinite number of possible values of $A$, and I think I should discretize these values, shouldn't I?
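For completeness, here is a minimal sketch of how $E(C)$, $E(C,A)$ and $IG(A)$ from the formulas above could be computed on a discretized attribute, using a base-2 logarithm; the function names and NumPy usage are my own, not an established API:

```python
import numpy as np

def entropy(labels):
    """E(C) = sum_i -P(c_i) * log2(P(c_i)) over the class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(binned_attribute, labels):
    """IG(A) = E(C) - sum_{a in A} P(a) * E(a), where E(a) is the class
    entropy among the examples falling into bin a."""
    e_c = entropy(labels)
    e_c_a = 0.0
    for a in np.unique(binned_attribute):
        mask = binned_attribute == a
        e_c_a += mask.mean() * entropy(labels[mask])  # P(a) * E(a)
    return e_c - e_c_a

# Example: tf-idf weights binned as above, with made-up binary class labels.
binned = np.digitize([0.0, 0.03, 0.15, 0.27, 0.85],
                     np.linspace(0.0, 1.0, 11), right=True)
labels = np.array([0, 0, 1, 1, 1])
print(information_gain(binned, labels))
```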