I’m finding R more and more useful for just dragging data out of things. RSS is a touchy subject for some, but I still use it a lot, and I built Curatic to get me the stories I want to read, not lists of stories I don’t. Anyway, that’s not the point of this post.

R and XPath are good friends. Pull an RSS feed and get the titles, descriptions and publication dates quickly.

> library(XML) 
> library(RCurl) 
> xml.url<-"https://dataissexy.wordpress.com/feed/" 
> rssdoc <- xmlParse(getURL(xml.url)) 
> rsstitle <- xpathSApply(rssdoc, '//item/title', xmlValue) 
> rssdesc <- xpathSApply(rssdoc, '//item/description', xmlValue) 
> rssdate <- xpathSApply(rssdoc, '//item/pubDate', xmlValue)
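If you want the three vectors in one place, a data frame is the obvious next step. Here is a rough sketch of that, not part of the original snippet: it parses a small made-up feed from a string (so it runs without a network connection) and the names `sample.rss`, `feed` and `published` are my own. The date parsing assumes the usual RSS 2.0 date layout and an English locale for the day and month names.

```r
library(XML)

# A tiny made-up feed, inlined so the sketch runs offline.
# Swap in xmlParse(getURL(xml.url)) to use a real feed.
sample.rss <- '<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First post</title>
      <description>Hello</description>
      <pubDate>Mon, 06 Jan 2014 10:00:00 +0000</pubDate>
    </item>
    <item>
      <title>Second post</title>
      <description>World</description>
      <pubDate>Tue, 07 Jan 2014 12:30:00 +0000</pubDate>
    </item>
  </channel>
</rss>'

# asText = TRUE tells xmlParse the argument is XML content, not a file name.
rssdoc <- xmlParse(sample.rss, asText = TRUE)

# Same XPath queries as above, collected into one data frame.
# This assumes every <item> has all three child elements; if one is
# missing, the vectors will have different lengths and data.frame() fails.
feed <- data.frame(
  title       = xpathSApply(rssdoc, '//item/title', xmlValue),
  description = xpathSApply(rssdoc, '//item/description', xmlValue),
  pubDate     = xpathSApply(rssdoc, '//item/pubDate', xmlValue),
  stringsAsFactors = FALSE
)

# Turn the pubDate strings into real date-times.
# %a and %b are locale dependent, so this gives NA in a non-English locale.
feed$published <- as.POSIXct(feed$pubDate,
                             format = "%a, %d %b %Y %H:%M:%S %z",
                             tz = "UTC")

print(feed[, c("title", "published")])
```

With a data frame you can sort by `published` or filter on the titles with the usual subsetting, rather than juggling three separate vectors.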

Done 🙂

 
