I'm trying to build a training set for a classifier.
Each vector is labeled either conclusive ('C') or inconclusive ('U'). Sample rows (label, identifier, number of values, then the values):
U Y69S 12 -1.5 1.83 3.45 5.412 6.441 9.864 14.666 15.68 12.082 8.384 4.016 0.0
U Y69T 12 0.904 1.699 3.672 6.543 7.642 10.435 16.099 16.604 13.411 8.916 5.427 0.0
C Y69V 12 -0.293 2.192 4.202 5.835 7.97 10.467 16.623 16.588 13.109 8.209 4.192 0.0
C Y69W 12 -6.65 -7.501 -6.627 -4.786 -5.456 -2.025 1.883 14.33 10.738 6.658 7.978 0.0
C Y80A 12 1.505 0.597 2.105 4.901 5.007 9.476 13.273 14.413 11.049 6.402 2.726 0.0
U Y80C 12 0.633 -0.558 0.328 3.899 5.734 7.99 13.345 14.463 10.246 4.905 1.134 0.0
C Y80D 12 4.928 6.02 6.754 9.612 12.618 17.849 17.876 17.605 12.73 7.035 2.059 0.0
U Y80E 12 -0.772 -1.421 0.855 2.469 7.932 16.783 16.341 15.808 12.597 8.455 4.644 0.0
C Y80F 12 0.311 -1.267 -0.332 3.294 5.497 8.231 11.756 13.57 9.524 5.054 1.777 0.0
U Y80G 12 -0.023 -0.346 1.376 4.351 4.044 8.748 12.373 15.347 10.454 6.044 2.55 0.0
C Y80H 12 -2.762 -4.235 -3.276 -0.661 1.749 5.74 10.979 13.685 9.291 6.207 1.279 0.0
When preparing the data set, should I include roughly equal numbers of 'C' and 'U' examples?
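
To check how balanced a sample like the one above is, the rows can be parsed and the labels counted. This is only a minimal sketch: it assumes each row is whitespace-separated as label, identifier, value count, then the values, and the names `SAMPLE` and `parse_row` are made up for illustration. Only three of the rows above are included to keep it short.

```python
from collections import Counter

# Three rows copied from the sample above; the full file would be read the same way.
SAMPLE = """\
U Y69S 12 -1.5 1.83 3.45 5.412 6.441 9.864 14.666 15.68 12.082 8.384 4.016 0.0
U Y69T 12 0.904 1.699 3.672 6.543 7.642 10.435 16.099 16.604 13.411 8.916 5.427 0.0
C Y69V 12 -0.293 2.192 4.202 5.835 7.97 10.467 16.623 16.588 13.109 8.209 4.192 0.0
"""


def parse_row(line):
    """Split one row into (label, identifier, values).

    Assumed layout: label, identifier, value count, then that many floats.
    """
    fields = line.split()
    label, name, n = fields[0], fields[1], int(fields[2])
    values = [float(x) for x in fields[3:]]
    if len(values) != n:
        raise ValueError(f"{name}: expected {n} values, got {len(values)}")
    return label, name, values


rows = [parse_row(line) for line in SAMPLE.splitlines() if line.strip()]

# Count how many examples carry each label.
counts = Counter(label for label, _, _ in rows)
for label, count in sorted(counts.items()):
    print(label, count, f"{count / len(rows):.0%}")
```

Running the same count over the full data set would show whether 'C' and 'U' are close to evenly represented before any resampling decision is made.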