In this post I will step through how to integrate geo-tagged tweets into the choropleth plots I created in the previous post, “Census 2010 I”. I will also show how you can use the
plotly package to make the plot interactive and enable a hover-over effect to display the tweet content.
Getting some tweet data
Not all tweets have geolocation information available. When accessing the Twitter API via the
streamR package, you can set parameters to include only those tweets with geolocation enabled. You can also specify a bounding box to further restrict the geographic area from which your sample tweets are drawn. The bounding box, however, is just that: a box, and we are dealing with polygons. To isolate tweets from a specific geo-political region, such as a census tract, you can use the
sp package. For details on how to get geo-tagged tweets in R and how to clip them to a specific spatial polygon, refer to my previous post Access Twitter posts by country.
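To make the box-versus-polygon point concrete, here is a minimal sketch of clipping points to a polygon. It assumes that sp's point.in.polygon function is an acceptable stand-in for the clipping described in the earlier post, which may have used a different approach. The tweets and region data frames, and all of their coordinates, are made up for illustration.

```r
library(sp)

# Made-up tweets: every point lies inside the bounding box
# lon -113 to -110, lat 31.5 to 33
tweets <- data.frame(
  lon  = c(-111.0, -110.2, -112.5, -111.8),
  lat  = c(31.8, 32.9, 31.6, 32.6),
  text = c("a", "b", "c", "d"),
  stringsAsFactors = FALSE
)

# Made-up triangular region standing in for a real boundary polygon
region <- data.frame(
  lon = c(-113.0, -110.0, -113.0),
  lat = c(31.5, 31.5, 33.0)
)

# point.in.polygon() returns 0 for points strictly outside the polygon
# and a positive code for points inside or on its boundary
inside <- point.in.polygon(tweets$lon, tweets$lat, region$lon, region$lat) > 0

# Keep only the tweets that fall within the region
clipped <- tweets[inside, ]
```

All four points pass the bounding-box filter, but only those inside the triangle survive the clip, which is why the second step is needed for irregular regions like census tracts.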
Picking up from last time
Our starting point, therefore, looks like this:
p.roadmap are the plots and
speakers.total are the variables I created in the last post based on the American Fact Finder data to visualize the level of Spanish speakers by census tract.
Here I will be working with a small set of tweets collected from Twitter and clipped to include only posts that originated from within Pima County, Arizona, the county in which Tucson is located. Here’s a quick look at the variables in the data:
There is plenty of interesting information you can play around with, but note that fields with user input often contain unreliable information. In this post I’ll only need a few key features (
text) and will include one other (
lang), which supports my aim of exploring the relationship between language choice on Twitter and US Census demographic information.
To include points on our map corresponding to Twitter posts, we use the
geom_point function, supplying the tweet data to that layer.
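As a minimal sketch of that layering step: the tract and tweets data frames below are made-up stand-ins (the real plot is built on the census tract polygons from the previous post), and the column names lon, lat, long, and group are assumptions. The key idea is that geom_point gets its own data argument, so the points come from the tweets rather than from the polygon data.

```r
library(ggplot2)

# Made-up square standing in for a census tract polygon
tract <- data.frame(
  long  = c(-111.2, -110.8, -110.8, -111.2),
  lat   = c(32.1, 32.1, 32.4, 32.4),
  group = 1
)

# Made-up tweet locations
tweets <- data.frame(
  lon = c(-111.00, -110.95, -110.90, -111.10),
  lat = c(32.20, 32.30, 32.25, 32.15)
)

# Each layer carries its own data and aesthetics, so the point
# layer does not need the columns used by the polygon layer
p <- ggplot() +
  geom_polygon(data = tract,
               aes(x = long, y = lat, group = group),
               fill = "grey85", color = "grey40") +
  geom_point(data = tweets, aes(x = lon, y = lat), alpha = 0.7)
```

If the base plot sets its aesthetics inside ggplot() instead, adding inherit.aes = FALSE to geom_point keeps the point layer from looking for polygon columns in the tweet data.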
There are various aesthetics that
ggplot2 makes available that we can use to visualize language (
lang). In this case I don’t want to see languages other than English and Spanish, so I will subset the data to the language codes en and
es and map lang to the
color aesthetic. Note that I’m naively trusting the language detection algorithm that Twitter uses.
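Here is a rough sketch of the subsetting and color mapping. The tweets data frame is made up, and I use base R's subset here as one way to do the filtering; any equivalent filter would do.

```r
library(ggplot2)

# Made-up tweets with Twitter-style language codes
tweets <- data.frame(
  lon  = c(-110.95, -110.90, -111.05, -111.00),
  lat  = c(32.20, 32.25, 32.30, 32.15),
  lang = c("en", "es", "fr", "en"),
  stringsAsFactors = FALSE
)

# Keep only English and Spanish tweets
tweets_en_es <- subset(tweets, lang %in% c("en", "es"))

# Map language to point color
p <- ggplot() +
  geom_point(data = tweets_en_es,
             aes(x = lon, y = lat, color = lang))
```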
If you’re like me, you’re thinking it would be cool to see what the content of these tweets is. The
plotly package can be hooked up with
ggplot2 to get a really cool effect in which the text appears when you hover over a point on the map.
Just load the
plotly library, create your standard plot, and then pass it to the package's ggplot2 conversion function.
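A minimal sketch of the hover effect, assuming the conversion is done with plotly's ggplotly function and that the tweet content is exposed through a text aesthetic. The tweets data frame is made up, and the exact calls in the original plot may have differed.

```r
library(ggplot2)
library(plotly)

# Made-up tweets; text holds the content to show on hover
tweets <- data.frame(
  lon  = c(-110.95, -110.90, -111.00),
  lat  = c(32.20, 32.25, 32.15),
  lang = c("en", "es", "en"),
  text = c("Good morning Tucson", "Buenos dias", "Monsoon season"),
  stringsAsFactors = FALSE
)

# text is not a standard geom_point aesthetic; it is set here
# only so that ggplotly() can pick it up for the tooltip
p <- ggplot(tweets, aes(x = lon, y = lat, color = lang, text = text)) +
  geom_point()

# Convert the static plot and show only the tweet text on hover
p_interactive <- ggplotly(p, tooltip = "text")
```

Printing p_interactive in an interactive session opens the plot in the viewer or browser, where hovering over a point reveals the tweet.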
This plot has only scratched the surface. There is a lot more to learn about plot.ly. I encourage you to head on over to their website and check out the growing documentation on the R API.