Sentiment Analysis on the Farmers’ Protest:

(DataByte Task 2)

Abirath Raju
5 min read · Jan 23, 2021

I chose to work with the TextBlob library in Python for this NLP-based project. I collected 50 tweets from various Twitter handles pertaining to the farmers’ protest hashtag to perform the analysis on.

Some of the tweets from the dataset are as follows:

As seen, this raw text needs to be cleaned before it can be passed to the NLP algorithm. It contains “@” mentions, “#” hashtag symbols and, in some cases, hyperlinks, all of which should be removed before proceeding to the next step. The code I’ve written for this is as follows:
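A minimal sketch of such a cleaning step, using only the standard `re` module. The function name follows the one mentioned in this post, but the regular expressions and the sample tweet are illustrative assumptions and may differ from the original code:

```python
import re

def TextCleaner(text: str) -> str:
    """Replace hyperlinks, @mentions and '#' symbols with a blank space."""
    text = re.sub(r"https?://\S+", " ", text)  # hyperlinks
    text = re.sub(r"@\w+", " ", text)          # @mentions
    text = text.replace("#", " ")              # '#' symbol only; the hashtag word is kept
    return re.sub(r"\s+", " ", text).strip()   # collapse the leftover whitespace

# Hypothetical tweet, not taken from the dataset
sample = "@user Stand with the farmers #FarmersProtest https://t.co/abc123"
print(TextCleaner(sample))
```

Removing links first keeps the later patterns from matching fragments inside a URL.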

I’ve declared a function TextCleaner, and inside it I’ve used the regular expression (re) library to detect @ mentions, # symbols and hyperlinks and replace them with a blank space. Executing this piece of code gives the cleaned text as follows:

Next, I calculated the Subjectivity and Polarity values for each tweet and stored them as separate columns in the dataframe. Polarity refers to how positive or negative a tweet is, and its value lies in the range [-1, 1], whereas Subjectivity refers to how opinionated the tweet is and lies in the range [0, 1]. The higher the value, the more subjective the tweet. The code implementation is as follows:

Printing the values,

Now, coming to the mathematics behind how TextBlob calculates these values:

If it recognizes a negation such as “not” in front of a word, it multiplies the polarity by -0.5 but doesn’t change the subjectivity. Apart from these two attributes, there is “Intensity”, which comes into play when a modifier like “very” precedes a word: both the polarity and the subjectivity are multiplied by the modifier’s intensity. So if a word like “good” normally has a polarity of 0.8 and a subjectivity of 0.56, then for “very good” the polarity becomes 0.8*1.3=1.04, which is capped at 1, and the subjectivity becomes 0.56*1.3=0.728. Here 1.3 is the multiplier that TextBlob uses for “very”.

Now, if a negation is present in front of a modifier, as in “not very”, both effects are compounded: the polarity is multiplied by -0.5 and by the inverse of the intensity (1/1.3), and the subjectivity is multiplied by 1/1.3 as well.

So, while assigning the polarity and subjectivity of a tweet, TextBlob assigns these values to individual words and phrases in the sentence and averages their contributions into a single value for the entire sentence.
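The arithmetic described here can be worked through in a few lines of plain Python. This is a simplified model of the rules as stated in this post, not TextBlob’s source code; the helper names and the base values 0.8 and 0.56 are illustrative:

```python
def clamp(value: float, low: float, high: float) -> float:
    """Keep a score inside its allowed range."""
    return max(low, min(value, high))

def score_phrase(polarity, subjectivity, intensity=1.0, negated=False):
    """Apply the modifier and negation rules to one word's base scores."""
    if negated:
        # Negation uses the inverse intensity and scales polarity by -0.5
        factor = 1.0 / intensity
        return (clamp(polarity * factor * -0.5, -1.0, 1.0),
                clamp(subjectivity * factor, 0.0, 1.0))
    return (clamp(polarity * intensity, -1.0, 1.0),
            clamp(subjectivity * intensity, 0.0, 1.0))

def average(phrases):
    """Average the (polarity, subjectivity) pairs of all scored phrases."""
    if not phrases:
        return (0.0, 0.0)
    n = len(phrases)
    return (sum(p for p, _ in phrases) / n, sum(s for _, s in phrases) / n)

BASE = (0.8, 0.56)  # illustrative base scores for a single word
VERY = 1.3          # intensity multiplier for "very"

very = score_phrase(*BASE, intensity=VERY)                    # polarity capped at 1.0
not_plain = score_phrase(*BASE, negated=True)                 # subjectivity unchanged
not_very = score_phrase(*BASE, intensity=VERY, negated=True)  # both effects combined
overall = average([very, not_plain])                          # sentence-level score

for name, scores in [("very", very), ("not", not_plain),
                     ("not very", not_very), ("average", overall)]:
    print(name, [round(x, 3) for x in scores])
```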

So, if the polarity is greater than 0, the tweet is positive; if it is 0, the tweet is neutral; and if it’s less than 0, the tweet is negative. The code I wrote is as follows:
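A minimal sketch of this rule as a function. The name `get_sentiment` and the label strings are assumptions, and the polarity values are made up for illustration:

```python
def get_sentiment(polarity: float) -> str:
    """Map a polarity score in [-1, 1] to a sentiment label."""
    if polarity > 0:
        return "Positive"
    if polarity < 0:
        return "Negative"
    return "Neutral"

# Hypothetical polarity scores
scores = [0.35, 0.0, -0.6]
labels = [get_sentiment(p) for p in scores]
print(labels)
```

With a dataframe, the same function could be applied to the Polarity column, for example `df["Polarity"].apply(get_sentiment)`.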

OUTPUT:

This shows the sentiment of the first 5 tweets.

DATA VISUALISATION:

1.WORD CLOUD:

A word cloud shows the most common words in the tweets in large text; as the frequency of a word decreases, its size in the image decreases as well.

CODE:
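The idea behind the word cloud can be sketched with the standard library alone: count how often each word occurs after dropping stopwords, since those counts are what determine the word sizes. The stopword set and sample tweets below are illustrative assumptions, not the ones used for the actual image:

```python
import re
from collections import Counter

# Illustrative stopwords; the list used for the original word cloud is not shown
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "are", "for", "on", "with"}

def word_frequencies(tweets, stopwords=STOPWORDS):
    """Count how often each non-stopword appears across all tweets."""
    words = re.findall(r"[a-z']+", " ".join(tweets).lower())
    return Counter(w for w in words if w not in stopwords)

# Hypothetical cleaned tweets
tweets = ["Farmers protest in Delhi", "Support the farmers", "Farmers and the new laws"]
freqs = word_frequencies(tweets)
print(freqs.most_common(3))
```

If the `wordcloud` package is installed, these counts can be rendered with `WordCloud().generate_from_frequencies(freqs)`; alternatively, `WordCloud(stopwords=STOPWORDS).generate(text)` does the counting itself.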

Here, I’ve created a stopwords list which consists of some of the trivial words that I don’t want to appear in the word cloud. Executing this gives,

2.SCATTER PLOT:

This plot shows each tweet as a point, positioned by its polarity (x-axis) and subjectivity (y-axis).

CODE:
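A minimal matplotlib sketch of such a plot. The function name, styling and sample scores are assumptions; the reference lines at polarity 0 and subjectivity 0.5 are added to mark the regions discussed in the inference:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt

def plot_polarity_subjectivity(polarity, subjectivity):
    """Scatter plot with polarity on the x-axis and subjectivity on the y-axis."""
    fig, ax = plt.subplots(figsize=(8, 6))
    # Semi-transparent markers make overlapping points appear darker
    ax.scatter(polarity, subjectivity, color="blue", alpha=0.5)
    ax.axvline(0, color="grey", linestyle="--")    # negative | positive
    ax.axhline(0.5, color="grey", linestyle="--")  # factual below, opinionated above
    ax.set_title("Sentiment Analysis")
    ax.set_xlabel("Polarity")
    ax.set_ylabel("Subjectivity")
    return fig

# Hypothetical scores, including two coincident neutral tweets
fig = plot_polarity_subjectivity([0.4, -0.3, 0.0, 0.0], [0.7, 0.2, 0.0, 0.0])
fig.savefig("scatter.png")
```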

PLOT:

INFERENCE:

All the points to the right of ‘0’ Polarity represent positive tweets, and all those to the left represent negative tweets. Also, factual tweets lie below the 0.5 Subjectivity line and opinionated tweets lie above it. One disadvantage is that the number of neutral tweets is not shown properly: some of the neutral tweets have the same subjectivity and polarity values, so they are drawn as coincident points.

3.PIE-CHART:

This shows the percentage of tweets that are positive, neutral and negative in the form of a pie chart.

CODE:
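One way to sketch this, assuming each tweet already carries a sentiment label: compute the percentage share per label and pass it to matplotlib’s `pie`. The helper name and the sample labels are illustrative:

```python
from collections import Counter

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt

def sentiment_percentages(labels):
    """Return the percentage of tweets carrying each sentiment label."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100 * n / total for label, n in counts.items()}

# Hypothetical labels, not the real results
labels = ["Positive", "Negative", "Positive", "Neutral"]
shares = sentiment_percentages(labels)

fig, ax = plt.subplots()
ax.pie(list(shares.values()), labels=list(shares.keys()), autopct="%1.1f%%")
ax.set_title("Share of tweets per sentiment")
fig.savefig("pie.png")
```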

PLOT:

4.BAR GRAPH:

This shows the number of tweets that belong to each sentiment.

CODE:
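A short sketch of the bar graph, again assuming a list of per-tweet sentiment labels. The helper name, the fixed label order and the sample data are assumptions for illustration:

```python
from collections import Counter

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt

ORDER = ("Positive", "Neutral", "Negative")

def sentiment_counts(labels, order=ORDER):
    """Count tweets per sentiment, in a fixed order so the bars stay comparable."""
    counts = Counter(labels)
    return [counts[name] for name in order]  # missing labels count as 0

# Hypothetical labels, not the real results
labels = ["Positive", "Negative", "Positive", "Neutral"]
counts = sentiment_counts(labels)

fig, ax = plt.subplots()
ax.bar(list(ORDER), counts)
ax.set_title("Number of tweets per sentiment")
ax.set_xlabel("Sentiment")
ax.set_ylabel("Count")
fig.savefig("bar.png")
```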

PLOT:

CONCLUSION:

I think TextBlob is a very useful tool, and it’s very intuitive to use. The results I got for the tweets were fairly close to the actual sentiments but, of course, there are exceptions.

One interesting example that I saw:

“Water cannons and tear gas were used on peaceful protestors”. We humans can recognize that this is a negative sentiment, but the machine sees the word “peaceful” and labels it positive.

All in all, I think performing NLP using TextBlob is brilliant, but there is always scope for improvement.
