I'm struggling to prevent my model from overfitting and need some clarification, if anyone can help.
I started with 5-fold cross-validation, and it didn't seem to help when running hyperparameter tuning.
I switched to 10 folds, which I thought might make a difference, but alas no, it has hardly affected my overfitting.
I then decided to try 10x10 cross-validation (10 folds, repeated 10 times), which has helped somewhat but not enough.
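In case the terminology is ambiguous, this is roughly what I mean by 10x10. It is only a minimal pure-Python sketch: `repeated_kfold_indices` and its parameters are names made up for illustration, not any library's API, and I'm assuming each repeat reshuffles the data before splitting it into 10 folds.

```python
import random

def repeated_kfold_indices(n_samples, n_splits=10, n_repeats=10, seed=0):
    """Yield (train_idx, test_idx) pairs for n_repeats shuffled k-fold rounds."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        # Deal the shuffled indices into n_splits nearly equal folds.
        folds = [indices[i::n_splits] for i in range(n_splits)]
        for k, test_idx in enumerate(folds):
            # Everything outside the held-out fold is training data.
            train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield train_idx, test_idx

# Hypothetical dataset size, just to show the number of splits produced.
splits = list(repeated_kfold_indices(n_samples=200))
print(len(splits))  # 10 repeats x 10 folds
```

Each hyperparameter setting then gets 100 scores instead of 10, so the averaged score should be less noisy, though as far as I understand the model itself is no less flexible.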
Is it wrong to add the min() fold score into the mean calculation at the end to help lower variance? I.e. the tuning would favor parameter settings with a higher worst-fold score, since a low min() would drag the average down. Would that even help?
For example, [.75, .78, .65] would score worse than [.75, .78, .69].
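To make the idea concrete, here is a rough sketch of what I have in mind. `mean_with_min` is a hypothetical helper name, and the second pair of score lists is invented purely to show a case where the ranking changes; only the first pair comes from my example above.

```python
from statistics import mean

def mean_with_min(fold_scores):
    """Average the fold scores with the worst fold counted one extra time."""
    scores = list(fold_scores)
    return mean(scores + [min(scores)])

# The two score lists from my example.
scores_a = [0.75, 0.78, 0.65]
scores_b = [0.75, 0.78, 0.69]
print(mean_with_min(scores_a), mean_with_min(scores_b))

# Invented scores: the plain mean prefers the erratic setting,
# while the min-weighted mean prefers the steadier one.
erratic = [0.80, 0.80, 0.60]
steady = [0.73, 0.73, 0.72]
print(mean(erratic) > mean(steady))
print(mean_with_min(erratic) < mean_with_min(steady))
```

In my own example the second list already wins on the plain mean, so the extra min() term would only change the outcome in cases like the invented pair.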
How about some sort of calculation that includes the standard deviation of the fold scores, to favor settings whose scores sit closer to the mean? I.e. giving a negative weight to settings with a higher stddev?
Is this even a thing? I've never read about it!
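Something like the following is what I am picturing for the stddev version. Again this is only a sketch under my own assumptions: `penalized_score` and its `weight` parameter are hypothetical names, and I have no principled reason for choosing a weight of 1.

```python
from statistics import mean, stdev

def penalized_score(fold_scores, weight=1.0):
    """Mean fold score minus a penalty proportional to the sample stddev."""
    return mean(fold_scores) - weight * stdev(fold_scores)

# The two score lists from my earlier example.
spread_out = [0.75, 0.78, 0.65]
tighter = [0.75, 0.78, 0.69]

# The list with the wider spread is penalized more heavily.
print(penalized_score(spread_out) < penalized_score(tighter))
```

With `weight=0` this reduces to the plain mean, so the weight would control how much consistency across folds is traded against average accuracy.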
Also, what is an acceptable difference between the test and training scores? I'm currently using accuracy and see a 4-5% gap between the two. Is this normal, or am I splitting hairs?