I've just started learning machine learning, and I decided to build a spam filter for my social app using a Naive Bayes classifier. I'm following this guide: https://hackernoon.com/how-to-build-a-simple-spam-detecting-machine-learning-classifier-4471fe6b816e
My app has ~70,000 posts, and about 3,000 of them are marked as spam, which leaves roughly 67,000 non-spam posts. How many of those non-spam posts should I use to train my model?
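
To make the question concrete, here is a rough sketch of what I mean by "choosing how many non-spam posts to use". It is not from the guide: `sample_non_spam` and its `ratio` parameter are hypothetical names I made up, the posts are placeholder strings, and the 1:1 ratio is only one option to compare, not a recommendation.

```python
import random

# Approximate counts from my app (rounded figures, not exact).
TOTAL_POSTS = 70_000
SPAM_POSTS = 3_000
NON_SPAM_POSTS = TOTAL_POSTS - SPAM_POSTS  # about 67,000


def sample_non_spam(non_spam_posts, n_spam, ratio=1.0, seed=0):
    """Randomly pick about ratio * n_spam non-spam posts.

    Hypothetical helper: `ratio` is the non-spam:spam ratio to keep,
    capped at the number of non-spam posts actually available.
    """
    k = min(len(non_spam_posts), int(n_spam * ratio))
    return random.Random(seed).sample(non_spam_posts, k)


# Placeholder data standing in for the real non-spam posts.
non_spam = [f"post {i}" for i in range(NON_SPAM_POSTS)]

# Option A: balance the classes (as many non-spam posts as spam posts).
balanced = sample_non_spam(non_spam, SPAM_POSTS, ratio=1.0)

# Option B: keep every non-spam post (the ratio is capped by availability).
everything = sample_non_spam(non_spam, SPAM_POSTS, ratio=100)

print(len(balanced), len(everything))
print(f"spam share of all posts: {SPAM_POSTS / TOTAL_POSTS:.1%}")
```

Spam makes up only about 4% of all posts, so the two options give very different class balances in the training set, and I don't know which one Naive Bayes handles better.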