During my bachelor thesis I gathered a set of comments, each labeled 0 for containing no hate or 1 for containing hate. The labels were given by volunteers. The roughly 2500 comments vary in length (between 100 and 1800 characters).
Now, after my thesis, I came across visualization techniques like PCA and t-SNE. Applied to the MNIST dataset of handwritten digits, these techniques show amazing results.
As I understand it, a comment consists of words, so it should somehow be high-dimensional data, just like the images in MNIST. Because of that: is it possible to visualize the comments with a technique such as PCA or t-SNE?
I don't know how I could convert the data, or where I can find a tutorial that applies such a technique to text. Thanks for your thoughts!
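
To make the question concrete, here is a minimal sketch of what I imagine the pipeline might look like. It is only a guess, not something I have checked against my real data: each comment becomes a vector of word counts (one dimension per vocabulary word), and plain PCA via NumPy's SVD projects those vectors to 2D. The helper names `bag_of_words` and `pca_2d` and the four example comments are made up for illustration. I assume a t-SNE version would swap the PCA step for a library implementation such as `sklearn.manifold.TSNE`, but I have not tried that.

```python
import re
from collections import Counter

import numpy as np


def bag_of_words(comments):
    """Turn each comment into a vector of word counts (hypothetical helper)."""
    tokenized = [re.findall(r"[a-z']+", c.lower()) for c in comments]
    vocab = sorted({w for toks in tokenized for w in toks})
    index = {w: i for i, w in enumerate(vocab)}
    X = np.zeros((len(comments), len(vocab)))
    for row, toks in enumerate(tokenized):
        for word, count in Counter(toks).items():
            X[row, index[word]] = count
    return X, vocab


def pca_2d(X):
    """Project the rows of X onto their first two principal components."""
    Xc = X - X.mean(axis=0)  # PCA needs mean-centered data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:2].T  # coordinates along the top two components


# Placeholder comments and labels, NOT my real data (0 = no hate, 1 = hate)
comments = [
    "I really enjoyed this article, thanks",
    "I hate people like you, go away",
    "Nice weather today in the park",
    "You people are awful and I hate you",
]
labels = [0, 1, 0, 1]

X, vocab = bag_of_words(comments)
points = pca_2d(X)

for (x, y), label in zip(points, labels):
    print(f"label={label} x={x:.2f} y={y:.2f}")
```

Each row of `points` would then be one comment's position in a scatter plot, colored by its label. Is this roughly the right idea, or is raw word counting too naive for comments of such different lengths?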