It's All a Bet

Jason Bell – Author, Advisor and Practitioner in Machine Learning and Artificial Intelligence

Over 475 machine learning citations and 50+ patent citations on machine learning.

Twitter sentiment analysis in 30 seconds.

The more I play with R, the more I love it and what it can do. Better still is CRAN, the repository of packages that add more functionality.

Okay, we’re not going to create the new Radian6 or RepKnight, but boy, it’s a lot of fun. Well, okay, *I* think it’s a lot of fun; I can’t speak for you wonderful lot.

For a much more detailed version of what follows, you might want to look at Jeffrey Breen’s post on airline consumer sentiment.

Getting the Basics down with R 

Load the twitteR package for reading tweets. If you don’t have the package installed, you can install it from within R with install.packages('twitteR', dependencies=TRUE)

> library(twitteR)

Load the plyr package.

> library(plyr)

Pull some tweets into R.

> retail.tweets = searchTwitter("#retail", n=1500)

Load a positive word list.

> hu.iu.pos = scan('/work/repos/lexicon/positive-words.txt', what='character', comment.char=';')

Load a negative word list.

> hu.iu.neg = scan('/work/repos/lexicon/negative-words.txt', what='character', comment.char=';')
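To see what those scan() arguments do, here is a minimal sketch that reads a tiny in-memory lexicon instead of a file. The three sample words and the header lines are made up for illustration; the assumption is that the real lexicon files carry semicolon-prefixed header lines, which is what comment.char=';' would skip.

```r
# Illustrative stand-in for a lexicon file: two comment lines, then one word per line.
# The contents are invented for this example, not taken from the real word lists.
lexicon.text <- "; header comment line\n;\nabound\nabounds\nabundance\n"

# what = "character" reads whitespace-separated tokens as strings;
# comment.char = ";" drops everything from a semicolon to the end of the line.
words <- scan(text = lexicon.text, what = "character",
              comment.char = ";", quiet = TRUE)

print(words)
```

The result is a plain character vector, which is the shape the word lists need to be in for the later steps.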

Load the script that does the magic. It’s pretty short all in all, and you can look at it here.

> source('sentiment.r')

Create a positive word list.

> pos.words = c(hu.iu.pos)

Create a negative word list.

> neg.words = c(hu.iu.neg)

Pull all the text from the tweets.

> retail.text = laply(retail.tweets, function(t) t$getText())

Work out a sentiment score for each tweet.

> retail.score = score.sentiment(retail.text, pos.words, neg.words)
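The score.sentiment function comes from the sourced script, which isn't reproduced in this post. As a rough idea of how word-list scoring of this kind can work, here is a minimal base R sketch. The name score_sentiment_sketch and the cleaning steps are my assumptions, not the contents of sentiment.r: each text is cleaned, split into words, and scored as positive matches minus negative matches.

```r
# Hypothetical stand-in for score.sentiment(); NOT the actual sentiment.r script.
score_sentiment_sketch <- function(sentences, pos.words, neg.words) {
  scores <- vapply(sentences, function(sentence) {
    # Strip punctuation, control characters and digits, then lower-case.
    cleaned <- gsub("[[:punct:]]", "", sentence)
    cleaned <- gsub("[[:cntrl:]]", "", cleaned)
    cleaned <- gsub("[[:digit:]]+", "", cleaned)
    cleaned <- tolower(cleaned)
    # Split on whitespace into individual words.
    words <- unlist(strsplit(cleaned, "[[:space:]]+"))
    # Score = number of positive matches minus number of negative matches.
    sum(words %in% pos.words) - sum(words %in% neg.words)
  }, numeric(1), USE.NAMES = FALSE)
  data.frame(score = scores, text = sentences, stringsAsFactors = FALSE)
}

# Toy inputs for illustration only.
example <- score_sentiment_sketch(
  c("I love this great shop",
    "Terrible service, awful queue",
    "Opening hours are 9 to 5"),
  pos.words = c("love", "great"),
  neg.words = c("terrible", "awful")
)
print(example$score)
```

With these toy inputs the first text scores 2, the second -2 and the third 0. Returning a data frame with a score column is what lets a call like hist(retail.score$score) work on the result.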

Display a graph of the sentiment scores.

> hist(retail.score$score)

[Figure: histogram of retail.score$score]

Where to go from here?

From my point of view there’s an obvious web element that’s missing, so there needs to be some sort of bridge from the web to R running in the background. The rJava package, which links R and Java, could act as a handy gateway between the two.

The plots could do with some tinkering to make them look colourful. There’s a ton of decent plot libraries out there for R that extend the basic range.
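As a starting point before reaching for extra libraries, base R's hist() already takes colours and labels. The sketch below uses a made-up vector of scores purely as a placeholder (swap in retail.score$score); the colour choices and the one-bar-per-score binning are my own assumptions.

```r
# Placeholder scores for illustration only; replace with retail.score$score.
scores <- c(-2, -1, -1, 0, 0, 0, 0, 1, 1, 2, 3)

# One bin per integer score, centred on the score value.
breaks <- seq(min(scores) - 0.5, max(scores) + 0.5, by = 1)
mids <- head(breaks, -1) + 0.5

# Red for negative bars, green for positive, grey for neutral.
bar.cols <- ifelse(mids < 0, "firebrick",
                   ifelse(mids > 0, "forestgreen", "grey70"))

h <- hist(scores, breaks = breaks, col = bar.cols,
          main = "Sentiment scores for #retail tweets",
          xlab = "Score (positive words minus negative words)",
          ylab = "Number of tweets")
```

Centring the bins on whole numbers keeps each bar tied to a single score, which makes the negative, neutral and positive colouring unambiguous.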


 
