Results of running SentimentAnalyzer() on Amazon Reviews Dataset. #187

@mdvsh

Background

As part of Google Code-in, I used the sentiment analysis model in TextAnalysis.jl to analyse the Amazon reviews dataset. I performed basic text pre-processing to improve the model's metrics. The steps I applied were:

  • stemming words in each review
  • removing corrupted characters
  • removing definite and indefinite articles
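
For readers who want to see what these three steps amount to, here is a minimal sketch in plain Python. It is not the TextAnalysis.jl code used for this run: the function names are hypothetical, and `toy_stem` is a deliberately naive suffix stripper standing in for a real stemmer.

```python
import re

# Matches the English articles as whole words, case-insensitively.
ARTICLES = re.compile(r"\b(?:a|an|the)\b", flags=re.IGNORECASE)

def strip_corrupt(text):
    # Drop the Unicode replacement character (a common sign of a bad decode)
    # and any non-printable control characters, but keep whitespace.
    return "".join(
        ch for ch in text
        if ch != "\ufffd" and (ch.isprintable() or ch.isspace())
    )

def strip_articles(text):
    # Replace each article with a space; extra spaces are collapsed later.
    return ARTICLES.sub(" ", text)

def toy_stem(word):
    # Naive illustration only: remove one common suffix if enough of the
    # word remains. A real stemmer (e.g. Porter) handles far more cases.
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def preprocess(review):
    cleaned = strip_articles(strip_corrupt(review))
    return " ".join(toy_stem(w) for w in cleaned.lower().split())
```

Calling `preprocess` on a raw review string returns a lowercased, space-separated string with articles and corrupted characters removed and each word crudely stemmed.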

I also found that remove_numbers!() (another pre-processing function mentioned in the Docs) raised an error when run. On further inspection, I found that it is still not implemented in the src/preprocessing.jl file. It is an issue worth looking into.

Also, a BoundsError occurred partway through the run:

BoundsError: attempt to access 32×5000 Array{Float32,2} at index [Base.Slice(Base.OneTo(32)), 5001]
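
The message says that column 5001 was requested from a matrix that has only 5000 columns. One possible reading, which is an assumption and not something verified in this issue, is that a word index larger than the number of columns was used for a lookup. A guard of the following kind (sketched in Python with a hypothetical function name, not the package's own code) would discard such indices before the lookup:

```python
VOCAB_SIZE = 5000  # number of columns reported in the error message

def in_vocab_indices(indices, vocab_size=VOCAB_SIZE):
    """Keep only 1-based indices that address an existing column."""
    return [i for i in indices if 1 <= i <= vocab_size]
```

Whether out-of-range words should be dropped or mapped to a dedicated "unknown" column is a design choice this sketch does not settle.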

This didn't affect the rest of the run, and the results I got are presented below.

Result

I learnt how precision, recall and F1 score differ as metrics of how well the model performs, and it was a wonderful learning experience.

Precision : 0.583117838593833
Recall : 0.5144996465068449
F1Score : 0.5466638895622987
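
As a consistency check, the F1 score is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The short Python sketch below (the function name is mine, not part of TextAnalysis.jl) applies that formula to the two values reported above:

```python
def f1_score(precision, recall):
    # Harmonic mean of precision and recall; defined as 0 when both are 0.
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

precision = 0.583117838593833
recall = 0.5144996465068449
f1 = f1_score(precision, recall)
print(f"F1 from reported precision and recall: {f1:.6f}")
```

The result agrees with the reported F1Score to within floating-point rounding, so the three numbers are mutually consistent.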

Related To:

  1. BoundsError in sentiment analysis #160
  2. Testing the efficiency of Sentiment Analysis models #185
  3. Better sentiment analysis model #84

cc @Ayushk4 @aviks
