
I'm new to this forum and to text analytics using Python and R. My question is somewhat similar to Is there a better approach than counting positive-negative words in sentiment analysis?

I'm working with a dataframe of 2000 product reviews of the SalesForce One app and want to classify each review's sentiment as positive or negative.

I would like to analyze each review as a whole sentence, not word by word. I reproduced the code from the link below, but the results deviate substantially because that approach scores one word at a time against the AFINN lexicon.

Link: Does sentiment analysis work? A tidy analysis of Yelp reviews

The NLTK approach in Python seems to fit my problem, but I'm unable to understand the code well enough to apply it to my case.

Sentiment Analysis

Please recommend a package/module in either R or Python that can analyze each review as a whole sentence (I'd be glad if some rough code were provided).

2 Answers


I think you answered your own question: NLTK is what you need. Providing some example sentences might help.

But from the NLTK link you provided, I just reproduced their examples:

  1. Install the VADER lexicon:

    >>> import nltk
    >>> nltk.download()
    

    A dialog pops up. Choose Models, then vader_lexicon, and download it.

  2. Import the analyzer and input your sentences:

    >>> from nltk.sentiment.vader import SentimentIntensityAnalyzer
    >>> sentences = ["The book was good.",         # positive sentence
    ... "The book was kind of good.", # qualified positive sentence is handled correctly (intensity adjusted)
    ... "The plot was good, but the characters are uncompelling and the dialog is not great.", # mixed negation sentence
    ... "A really bad, horrible book."]       # negative sentence with booster words
    
  3. Run the analysis:

    >>> sid = SentimentIntensityAnalyzer()
    >>> for sentence in sentences:
    ...     print(sentence)
    ...     ss = sid.polarity_scores(sentence)
    ...     for k in sorted(ss):
    ...         print('{0}: {1}, '.format(k, ss[k]), end='')
    ...     print()
    

    and you get the scores:

    The book was good.
    compound: 0.4404, neg: 0.0, neu: 0.508, pos: 0.492,
    The book was kind of good.
    compound: 0.3832, neg: 0.0, neu: 0.657, pos: 0.343,
    The plot was good, but the characters are uncompelling and the dialog is not great.
    compound: -0.7042, neg: 0.327, neu: 0.579, pos: 0.094,
    A really bad, horrible book.
    compound: -0.8211, neg: 0.791, neu: 0.209, pos: 0.0,
    
KPLauritzen

I suggest you use more tools, such as the textWiller package, LDA, and LSA (latent semantic analysis) in R. There is also good software called RapidMiner; it is really intuitive, smart, and agile. I suggest you download it and try the trial.

Dalila
  • Welcome to this site! Please do not provide multiple answers when they are related. Instead, you should provide a valid URL, a short description of what the software does, and ideally why you recommend it. See our [Help Center](http://stats.stackexchange.com/help/how-to-answer) to learn more. – chl Nov 15 '16 at 13:40