Here is the static map; scroll down for the interactive version.
I noticed that someone posted on Facebook how much slot revenue was generated in my hometown of Salem, Illinois, and it seemed like a lot. I wanted to see whether it really was, so I decided to take a look for myself. The Illinois Gaming Board makes it easy to pull down the monthly reports, which I was surprised to see. It even gives the option of downloading a .csv file. Nice.
Gaming revenue is highly dependent on population, or at least it would seem to be. It would make no sense to plot this revenue without any context, so I needed a dataset of the population of cities and towns in the state of Illinois. Wikipedia to the rescue. Using the rvest package, I was able to scrape the Wikipedia table and then do some cleaning on it.
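To make the scrape-and-clean step concrete, here is a minimal sketch in Python (standard library only) rather than the rvest code I actually used. The HTML snippet, town names, and populations are made up for illustration; a real Wikipedia table would need more cleanup than this.

```python
import re
from html.parser import HTMLParser

# Made-up stand-in for a Wikipedia-style table; not the real page or real figures.
SAMPLE_HTML = """
<table>
  <tr><th>Municipality</th><th>Population</th></tr>
  <tr><td>Exampleville</td><td>7,500</td></tr>
  <tr><td>Sampleton[1]</td><td>12,300</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collects the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def clean_row(name, population):
    """Strip footnote markers like [1] and turn '12,300' into 12300."""
    name = re.sub(r"\[\d+\]", "", name).strip()
    return name, int(population.replace(",", ""))


parser = TableParser()
parser.feed(SAMPLE_HTML)
header, *body = parser.rows
towns = [clean_row(name, pop) for name, pop in body]
```

The cleaning here covers only the two problems in the toy table: footnote markers in names and thousands separators in numbers.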
Merging the Datasets
The problem comes in joining the two datasets. I had already taken some of the county-level data out of the Illinois Gaming Board data, because I didn’t want to plot the slots that are in the counties, just the ones located in cities and towns. A straight merge will miss some matches that should be made. So I used the fuzzyjoin R package, which lets you “fudge” the merge a little so that the key values don’t have to match exactly. Then I removed any duplicates that resulted.
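The idea behind a fuzzy merge can be sketched in a few lines of Python using the standard library’s difflib. This is not fuzzyjoin and not the code I ran; the town names, figures, and the 0.85 similarity cutoff are all assumptions chosen for illustration.

```python
from difflib import get_close_matches

# Hypothetical rows standing in for the gaming report (note the misspelling).
gaming = [
    {"municipality": "Exampleville", "amount_played": 1500000.0},
    {"municipality": "Sampelton", "amount_played": 900000.0},
    {"municipality": "Sampleton", "amount_played": 900000.0},   # duplicate entry
    {"municipality": "Nowhere County", "amount_played": 50000.0},
]

# Hypothetical population lookup standing in for the cleaned Wikipedia table.
population = {"Exampleville": 7500, "Sampleton": 12300}


def fuzzy_join(gaming_rows, population_by_town, cutoff=0.85):
    """Attach a population to each gaming row whose name is 'close enough'."""
    lookup = {name.lower(): name for name in population_by_town}
    joined = {}
    for row in gaming_rows:
        matches = get_close_matches(
            row["municipality"].lower(), list(lookup), n=1, cutoff=cutoff
        )
        if not matches:
            continue  # no town is similar enough, so the row is dropped
        town = lookup[matches[0]]
        if town not in joined:  # remove duplicates: keep the first match per town
            joined[town] = {
                **row,
                "town": town,
                "population": population_by_town[town],
            }
    return list(joined.values())


merged = fuzzy_join(gaming, population)
```

A lower cutoff catches more misspellings but also risks pairing towns that merely have similar names, so it is worth eyeballing the matches before trusting them.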
How to Measure Gambling Activity?
The gaming reports contain a lot of different numbers, and each offers a different way to measure how much gambling is going on. After thinking about it a lot, I decided that the total amount lost is probably not a good measure. Here’s why. Today I could walk in with one hundred dollars, win some, lose some, play for four hours, and walk out with the same hundred dollars. Tomorrow I could start with the same hundred dollars, lose it all in ten minutes, and stop. On the first day I probably wagered over a thousand dollars in total but lost nothing; on the second day I wagered just one hundred and lost all of it. Total amount played gets us closer to the actual volume of gambling, so let’s divide that by the number of people in each town.
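The two-day example works out like this in a short Python sketch. The dollar amounts are the illustrative ones from the story (with the first day rounded to an even thousand), and the town used for the per-person figure is hypothetical.

```python
def net_loss(day):
    """Money the player is down at the end of the day."""
    return day["amount_played"] - day["amount_won"]


def played_per_capita(amount_played, population):
    """Total amount wagered divided by the number of residents."""
    return amount_played / population


# Day one: four hours of play, about $1,000 wagered, broke even.
day_one = {"amount_played": 1000.0, "amount_won": 1000.0}
# Day two: ten minutes of play, $100 wagered, all of it lost.
day_two = {"amount_played": 100.0, "amount_won": 0.0}

losses = (net_loss(day_one), net_loss(day_two))
played = (day_one["amount_played"], day_two["amount_played"])

# Hypothetical town: $250,000 played in a month, 5,000 residents.
per_person = played_per_capita(250000.0, 5000)
```

Ranked by losses, day two looks like the heavier gambling day; ranked by amount played, day one does by a factor of ten, which matches how much time was actually spent at the machine.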
Getting my Coordinates
Now I need to get the longitude and latitude of each city in the dataset. To make that easier, I added a column with the state name, then combined the city and state into a new column, which I fed into the geocode command to get a more accurate set of coordinates. Here, I just saved that data and loaded it instead of hitting the Google Maps API every time.
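The save-and-reload trick can be sketched as follows in Python. The geocoder here is a fake placeholder that returns dummy coordinates, not a real Google Maps call, and the function names, file name, and town names are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def build_query(city, state="Illinois"):
    """Combine city and state so the geocoder does not pick a same-named town elsewhere."""
    return f"{city}, {state}"


def geocode_with_cache(cities, cache_path, geocode_fn):
    """Look up each city once, saving results to disk so later runs skip the API."""
    cache_path = Path(cache_path)
    cache = json.loads(cache_path.read_text()) if cache_path.exists() else {}
    for city in cities:
        query = build_query(city)
        if query not in cache:
            cache[query] = geocode_fn(query)
    cache_path.write_text(json.dumps(cache, indent=2))
    return cache


def fake_geocode(query):
    """Placeholder for a real geocoding call; returns dummy coordinates."""
    fake_geocode.calls += 1
    return {"lon": 0.0, "lat": 0.0}


fake_geocode.calls = 0

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "coords.json"
    cities = ["Exampleville", "Sampleton"]
    first = geocode_with_cache(cities, path, fake_geocode)
    second = geocode_with_cache(cities, path, fake_geocode)  # served from the file
```

The second call never touches the geocoder, because every query is already in the saved file.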
Making a Popup
Finally, I am going to create some variables that will be displayed in a popup whenever the user clicks a pin. This took some work, but I think it gives the user a lot more information that may be useful.
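A popup is essentially a small HTML string built from each row of the data. Here is a minimal Python sketch of that idea; the fields shown, the layout, and the sample values are assumptions for illustration, not the exact popup on my map.

```python
from html import escape


def make_popup(town, population, amount_played):
    """Build the HTML shown when a pin is clicked."""
    per_person = amount_played / population
    return (
        f"<b>{escape(town)}</b><br>"
        f"Population: {population:,}<br>"
        f"Amount played: ${amount_played:,.0f}<br>"
        f"Played per resident: ${per_person:,.2f}"
    )


# Hypothetical town and figures.
popup = make_popup("Exampleville", 7500, 1500000)
```

Escaping the town name keeps stray characters such as `&` or `<` from breaking the popup’s HTML.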