I've added new attributes to a binary GLM model. AUC climbed to 98% and logistic loss decreased to 0.45. The training set has ~50 cases.
I can see that the predicted probabilities are extremely close to 0 and 1 (the max f2 threshold from 8-fold cross-validation is very close to 1).
My model has 11 attributes, including g1. Below are the model predictions for varying values of g1 (g1.val) with the other attributes held fixed:
+---------+--------+---------------+----------+--------+
| predict | p.NORM | p.C1 | StdErr | g1.val |
+---------+--------+---------------+----------+--------+
| NORM | 1 | 2.200799e-37 | 445.9396 | 19 |
| NORM | 1 | 3.609197e-37 | 452.2013 | 20 |
| NORM | 1 | 5.918897e-37 | 459.9089 | 21 |
+---------+--------+---------------+----------+--------+
When I remove the 2 new attributes (t1, t2), the predictions (raw probabilities) of the modified model look better:
+---------+-----------+--------------+----------+--------+
| predict | p.NORM | p.C1 | StdErr | g1.val |
+---------+-----------+--------------+----------+--------+
| NORM | 0.999481 | 0.0005190334 | 3.068864 | 19 |
| NORM | 0.9991453 | 0.0008547492 | 2.96949 | 20 |
| NORM | 0.9985927 | 0.001407304 | 2.901418 | 21 |
+---------+-----------+--------------+----------+--------+
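To compare the two models on the link scale, here is a minimal sketch that converts the p.C1 values from both tables to log-odds. It assumes a binomial GLM with the standard logit link; the helper names (`logit`, `log_odds_by_g1`) are my own and are not part of h2o.

```python
import math


def logit(p):
    """Log-odds of a probability p, computed stably for p near 0."""
    return math.log(p) - math.log1p(-p)


def log_odds_by_g1(p_c1_by_g1):
    """Map each g1.val to the log-odds implied by its p.C1."""
    return {g1: logit(p) for g1, p in p_c1_by_g1.items()}


# p.C1 values copied from the two tables above, keyed by g1.val
with_t1_t2 = {19: 2.200799e-37, 20: 3.609197e-37, 21: 5.918897e-37}
without_t1_t2 = {19: 0.0005190334, 20: 0.0008547492, 21: 0.001407304}

full = log_odds_by_g1(with_t1_t2)
reduced = log_odds_by_g1(without_t1_t2)

for g1 in sorted(full):
    print(f"g1.val={g1}: log-odds with t1,t2 = {full[g1]:.2f}, "
          f"without = {reduced[g1]:.2f}")

# Change in log-odds per unit step of g1 (the implied g1 effect)
full_step = full[20] - full[19]
reduced_step = reduced[20] - reduced[19]
print(f"step per unit g1: with t1,t2 = {full_step:.3f}, "
      f"without = {reduced_step:.3f}")
```

Under that assumption, the model with t1 and t2 puts these rows at a log-odds of roughly -84, versus roughly -7.6 without them, while the step per unit of g1 is about 0.5 in both. So the effect of g1 looks almost unchanged and the shift seems to come from the rest of the linear predictor.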
The predictions of the modified model are no longer extremely close to 0 and 1, but AUC decreased to 88% and logistic loss increased slightly.
What causes such a change in the predicted probabilities?
I don't think t1 and t2 are much more important than the other attributes. Also, I can't find a description of StdErr for h2o.predict() results in the documentation or examples.