I want to compute the mutual information between each feature and my output variable. What is the best way to choose the number of bins for each feature? How should the bin intervals be chosen, and should all features use the same bin size? Here is what my data looks like:
A B C D E
0.30890524 0.54426331 0.100881953 1.7844281 1.9580541
0.1037904 0.10647233 0.102095382 0.8488240 0.7763768
0.10367904 0.147233 0.102095382 0.8488240 0.7763768
0.331458 0.57973406 0.130334158 1.6350764 1.5585344
0.15101780 0.2377797 0.150907454 0.8556408 1.0199345
0.14075664 0.04942940 0.0103453 0.7010386 0.523710
0.02547862 0.01841224 0.04950307 0.1694650 0.1293436
0.31318298 0.123281 0.387902840 1.013703 0.9320006
0.545757 0.5898526 0.242701313 1.8950583 2.1294465
0.332576 0.44881516 0.181627835 1.5116738 1.5444081
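
To make the two binning options in the question concrete, here is a minimal pure-Python sketch of a histogram (plug-in) mutual information estimate. It is only an illustration under stated assumptions: the function names `equal_width_edges`, `equal_frequency_edges` and `binned_mutual_information` are made up for this example and do not come from any library, and the default of 10 bins is arbitrary, not a recommendation.

```python
import math
from bisect import bisect_right
from collections import Counter

def equal_width_edges(values, bins):
    """Interior bin edges that split [min, max] into `bins` equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    return [lo + i * width for i in range(1, bins)]

def equal_frequency_edges(values, bins):
    """Interior bin edges at sample quantiles, so bins hold roughly equal counts."""
    ordered = sorted(values)
    n = len(ordered)
    # The set drops duplicate edges, which can occur when values are tied.
    return sorted({ordered[(i * n) // bins] for i in range(1, bins)})

def binned_mutual_information(x, y, bins=10, edges=equal_width_edges):
    """Plug-in estimate of I(X;Y) in nats after discretizing x and y."""
    x_edges, y_edges = edges(x, bins), edges(y, bins)
    # bisect_right maps each value to the index of the bin it falls in.
    x_idx = [bisect_right(x_edges, v) for v in x]
    y_idx = [bisect_right(y_edges, v) for v in y]
    n = len(x)
    joint = Counter(zip(x_idx, y_idx))
    count_x = Counter(x_idx)
    count_y = Counter(y_idx)
    # Sum p(x,y) * log(p(x,y) / (p(x) p(y))) over the non-empty cells only.
    return sum(
        (c / n) * math.log(c * n / (count_x[i] * count_y[j]))
        for (i, j), c in joint.items()
    )
```

Calling `binned_mutual_information(feature, target, bins=k, edges=equal_frequency_edges)` for several values of `k` shows how sensitive the estimate is to the binning. With only a handful of rows, as in the sample above, the estimate is biased upward for any but a very small number of bins.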