I would like to split a dataset into train and test sets in such a way that every category of every categorical variable appears in both the train and the test split.
I tried the following (using scikit-learn):
df_moto_train, df_moto_test = train_test_split(df_moto, test_size=0.15, stratify=df_moto[cols_obj])
(where cols_obj is a list of the categorical columns of the dataframe df_moto)
but I got this error:
ValueError: The least populated class in y has only 1 member, which is too few. The minimum number of groups for any class cannot be less than 2.
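To illustrate, here is a minimal sketch that I believe reproduces the problem. The dataframe df_toy and its columns brand and color are made up for illustration and only stand in for df_moto. As far as I understand, when stratify receives several columns, each distinct combination of values is treated as one class, so any combination that occurs only once triggers the error:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy data standing in for df_moto (not my real dataset).
df_toy = pd.DataFrame({
    "brand": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "color": ["red", "red", "blue", "blue", "red", "red", "blue", "green"],
})
cols_obj = ["brand", "color"]

# Count how often each (brand, color) combination occurs; combinations
# seen fewer than twice cannot be placed in both train and test.
combo_counts = df_toy.groupby(cols_obj).size()
singletons = combo_counts[combo_counts < 2]
print(singletons)

# Stratifying on both columns at once fails because of those singletons.
try:
    train_test_split(df_toy, test_size=0.25, stratify=df_toy[cols_obj])
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

In this toy example every individual category (A, B, red, blue) except green occurs at least twice, yet the split still fails because the combinations (B, blue) and (B, green) each occur only once.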
Thanks.