I have a dataset containing feature vectors generated from movie subtitles, something like:
        Comedy    Disaster  Romance   ...
Movie1  0.037283  0.28866   0.36253
Movie2 ...................
I want to use cosine similarity. Before computing it, I tried scaling the vectors by row, and separately tried normalising the data by column, and the two approaches give different similarity results. I don't understand why they differ. Is there a paper I could look at?
I used 'scale()' to scale the vectors by row and 'preprocessing.MinMaxScaler()' to normalise each column, but I'm not sure this is correct. Could someone explain the difference and tell me which method is better for my case?
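For reference, here is a minimal sketch of the two variants I am comparing. It assumes that 'scale()' means `sklearn.preprocessing.scale` called with `axis=1`, and that similarities are computed with `sklearn.metrics.pairwise.cosine_similarity`. Only the first row is from my data; the other two rows are made-up placeholder values so that the example runs.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.preprocessing import MinMaxScaler, scale

# Rows are movies, columns are genre features (Comedy, Disaster, Romance).
# Only the first row is real; rows 2 and 3 are hypothetical placeholders.
X = np.array([
    [0.037283, 0.28866, 0.36253],
    [0.50, 0.10, 0.20],
    [0.05, 0.30, 0.40],
])

# Variant 1: standardise each row (axis=1), so every movie's vector
# has zero mean and unit variance across its own features.
X_rows = scale(X, axis=1)

# Variant 2: min-max normalise each column, so every feature
# is mapped to the range [0, 1] across all movies.
X_cols = MinMaxScaler().fit_transform(X)

# Pairwise cosine similarity between movies for each variant.
sim_raw = cosine_similarity(X)
sim_rows = cosine_similarity(X_rows)
sim_cols = cosine_similarity(X_cols)

print("raw:\n", sim_raw)
print("row-scaled:\n", sim_rows)
print("column-normalised:\n", sim_cols)
```

The three printed matrices are not the same, which is the behaviour I am asking about.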
Many thanks.