The more I play with R, the more I love it and what it can do. Better still is CRAN, the repository of packages that add more functionality.

Okay, we’re not going to create the new Radian6 or RepKnight, but boy it’s a lot of fun. Well, okay, *I* think it’s a lot of fun; I can’t speak for you wonderful lot.

For a much more detailed version of what follows, you might want to look at Jeffrey Breen’s post on airline consumer sentiment.

Getting the basics down with R

Load the twitteR package for reading tweets. If you don’t have the package installed, you can install it from within R with install.packages('twitteR', dependencies=TRUE)

> library(twitteR)

Load the plyr package.

> library(plyr)

Pull some tweets into R.

> retail.tweets = searchTwitter("#retail", n=1500)

Load a positive word list.

> hu.iu.pos = scan('/work/repos/lexicon/positive-words.txt', what='character', comment.char=';')

Load a negative word list.

> hu.iu.neg = scan('/work/repos/lexicon/negative-words.txt', what='character', comment.char=';')
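
If the scan() arguments look a bit cryptic, here is a small sketch of what they do. It assumes the lexicon files carry header lines that start with a semicolon, which is what comment.char=';' suggests. The file contents and the names lexicon.file and demo.words are made up for illustration.

```r
# Hypothetical miniature lexicon, written to a temporary file for illustration.
lexicon.file <- tempfile(fileext = ".txt")
writeLines(c("; header lines in the file start with a semicolon",
             ";",
             "",
             "good",
             "great"), lexicon.file)

# what = "character" reads each whitespace-separated token as a string;
# comment.char = ";" drops everything from a semicolon to the end of the line.
demo.words <- scan(lexicon.file, what = "character", comment.char = ";", quiet = TRUE)
print(demo.words)
```

Only the actual words survive, so the result can be used directly as a word list.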

Load the script that does the magic; it’s pretty short all in all, and you can look at it here.

> source('sentiment.r')

Create the positive word list for scoring. Wrapping it in c() leaves room to append extra terms of your own.

> pos.words = c(hu.iu.pos)

Create the negative word list in the same way.

> neg.words = c(hu.iu.neg)

Pull all the text from the tweets.

> retail.text = laply(retail.tweets, function(t) t$getText())

Work out a sentiment score for each tweet.

> retail.score = score.sentiment(retail.text, pos.words, neg.words)
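
To give a feel for the kind of thing a scorer like this does, here is a minimal sketch in base R. It is not the contents of sentiment.r: score_sketch is a hypothetical stand-in, and it assumes the simplest approach, which is to clean each tweet, split it into words, and subtract the number of negative matches from the number of positive matches.

```r
# Hypothetical stand-in for score.sentiment(); an assumption, not the real script.
score_sketch <- function(sentences, pos.words, neg.words) {
  scores <- vapply(sentences, function(sentence) {
    # Strip punctuation, control characters and digits, then lower-case.
    cleaned <- gsub("[[:punct:]]", "", sentence)
    cleaned <- gsub("[[:cntrl:]]", "", cleaned)
    cleaned <- gsub("[[:digit:]]+", "", cleaned)
    cleaned <- tolower(cleaned)
    # Split on whitespace into individual words.
    words <- unlist(strsplit(cleaned, "\\s+"))
    # Score = number of positive words minus number of negative words.
    sum(words %in% pos.words) - sum(words %in% neg.words)
  }, integer(1), USE.NAMES = FALSE)
  data.frame(score = scores, text = sentences, stringsAsFactors = FALSE)
}

# Made-up tweets and tiny word lists, purely for illustration.
demo <- score_sketch(
  c("Great service, love this shop!",
    "Terrible queue and awful prices",
    "Open until nine"),
  pos.words = c("great", "love"),
  neg.words = c("terrible", "awful")
)
print(demo)
```

A score above zero means more positive than negative words, below zero the reverse, and zero means either no matches or an even split.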

Display a histogram of the sentiment scores.

> hist(retail.score$score)

[Figure: Retailscore, the histogram of sentiment scores]

Where to go from here?

From my point of view there’s an obvious web element that’s missing, so there needs to be some sort of bridge from the web to R running in the background. The rJava package, which connects R and Java, could act as a handy gateway between the two.

The plots could do with some tinkering to make them look more colourful. There are plenty of decent plotting libraries for R that extend the basic range.
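
Even before reaching for an extra library, base R can get you some colour. The sketch below uses made-up scores in place of retail.score$score; demo.scores, the colour choices and the plot title are my own assumptions.

```r
# Made-up scores standing in for retail.score$score.
demo.scores <- c(-2, -1, -1, 0, 0, 0, 0, 1, 1, 2, 3)

# One bin per whole-number score, centred on the score itself.
breaks <- seq(min(demo.scores) - 0.5, max(demo.scores) + 0.5, by = 1)
h <- hist(demo.scores, breaks = breaks, plot = FALSE)

# Red for negative bins, grey for neutral, green for positive.
bar.colours <- ifelse(h$mids < 0, "firebrick",
                      ifelse(h$mids > 0, "forestgreen", "grey70"))

plot(h, col = bar.colours, border = "white",
     main = "Sentiment of #retail tweets",
     xlab = "Sentiment score", ylab = "Number of tweets")
```

Computing the histogram first with plot = FALSE gives you the bin midpoints, which makes it easy to pick a colour per bar.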


 
