What I’m looking for is a relationship between how liberal a state is and how good the health insurance offered through its exchange is. There are many different ways to quantify “good” health insurance, but I will use two measures:
Maximum Out of Pocket Costs for a Family
Monthly Premiums
First I need to subset my data for reasons that will become clear in just a second. I want to only use the 2014 data.
After doing a quick exploration I see that this dataset has dental insurance mixed in with health insurance. I want to drop those dental insurance plans.
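A minimal sketch of these two filters. The data frame name `plans` and the column names `BusinessYear` and `DentalOnlyPlan` are assumptions for illustration, and the toy rows stand in for the real file.

```r
# Toy stand-in for the plan data; the column names are assumptions.
plans <- data.frame(
  BusinessYear   = c(2014, 2014, 2015, 2014),
  StateCode      = c("AK", "AK", "AL", "AL"),
  DentalOnlyPlan = c("No", "Yes", "No", "No"),
  stringsAsFactors = FALSE
)

# Keep only the 2014 plans.
plans_2014 <- plans[plans$BusinessYear == 2014, ]

# Drop the dental-only plans so that only health plans remain.
health_2014 <- plans_2014[plans_2014$DentalOnlyPlan == "No", ]

nrow(health_2014)
```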
Now, I find the variable I am looking for: TEHBInnTier1FamilyMOOP. However, the data is not very clean. For example:
There is a dollar sign in there as well as a comma. Both of those characters make it impossible to convert this to a numeric vector.
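One way to do the cleanup is to strip both characters with `gsub` and then convert. This is only a sketch: the raw strings below are hypothetical examples of the format, not values copied from the dataset.

```r
# Hypothetical raw values in the "$12,700" style described above.
raw_moop <- c("$12,700", "$6,350", "$0")

# Remove every dollar sign and comma, then convert to numeric.
# Inside a bracket expression, "$" is matched literally.
clean_moop <- as.numeric(gsub("[$,]", "", raw_moop))

clean_moop
```

Any entry that still isn’t a number after stripping would come back as `NA` with a warning, so it is worth checking `sum(is.na(clean_moop))` afterwards.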
Much better. Let’s visualize the range of values.
The distribution is strongly bimodal, and it looks censored on both the left and the right. The maximum value is 12700, and over 4000 plans have that as their maximum out of pocket. After doing some research I found that $12,700 is the highest family MOOP allowed for plans available through the ACA. Makes sense.
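A quick check like the following shows how many plans sit exactly at the cap. The `moop` vector here is toy data, not the real column.

```r
# Toy MOOP values; the real vector would be the cleaned numeric column.
moop <- c(0, 0, 4500, 12700, 12700, 12700)

# The largest value, and the number of plans sitting exactly at it.
cap <- max(moop)
n_at_cap <- sum(moop == cap)

cap
n_at_cap
```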
Now, I want to find out what the average MOOP is for each state that is contained in the dataset.
Nice. Nothing looks like it’s out of place. Just want to rename my columns.
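The per-state average and the renaming can be sketched with base R’s `aggregate`. The data frame `health`, its columns, and the new column names `state` and `mean_moop` are all assumptions for illustration.

```r
# Toy plan-level data; the names are assumptions.
health <- data.frame(
  StateCode = c("AK", "AK", "AL", "AL"),
  moop      = c(10000, 12000, 8000, 9000)
)

# Mean family MOOP for each state.
state_moop <- aggregate(moop ~ StateCode, data = health, FUN = mean)

# Give the columns friendlier names.
names(state_moop) <- c("state", "mean_moop")

state_moop
```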
Now, I need to find a measure of ideology. Thankfully, Richard Fording has a dataset that contains a score for each state in the United States. The latest scores are for the year 2014, which is why I only used that year in my earlier subsetting. There wasn’t a really good way to do this using R, so I just created the vector by hand.
Higher values are more liberal and lower values are more conservative. South Carolina has a score of 0, which makes it the most conservative state.
Full data available here: https://rcfording.wordpress.com/state-ideology-data/
I want to use a nice theme for my visuals. I copied some code I found on GitHub that did a similar type of visualization using state abbreviations as markers.
And now onto a visualization. I’m going to throw a regression line on the visualization just to give a sense of relationship.
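A rough version of this kind of plot in base graphics, rather than the borrowed theme, might look like the sketch below. The state labels and every number in `toy` are invented placeholders, not real ideology scores or MOOP averages.

```r
# Invented placeholder data: labels and values are NOT real scores.
toy <- data.frame(
  state     = c("AA", "BB", "CC", "DD", "EE"),
  ideology  = c(5, 25, 50, 70, 90),
  mean_moop = c(9000, 9500, 9200, 10100, 9900)
)

# Simple linear regression of mean MOOP on ideology.
fit <- lm(mean_moop ~ ideology, data = toy)

# Empty plot frame, then state abbreviations as the markers.
plot(toy$ideology, toy$mean_moop, type = "n",
     xlab = "State ideology score (higher = more liberal)",
     ylab = "Mean family MOOP ($)")
text(toy$ideology, toy$mean_moop, labels = toy$state)

# Overlay the fitted regression line.
abline(fit)
```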
Looks like more liberal states actually have HIGHER overall MOOPs and more conservative states have LOWER MOOPs. However, the relationship isn’t statistically significant.
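Significance here can be read off the slope’s p-value in the regression summary. A sketch on invented placeholder data (the 0.05 threshold is the conventional cutoff, assumed for illustration):

```r
# Invented placeholder data: NOT real ideology scores or MOOP averages.
toy <- data.frame(
  ideology  = c(5, 25, 50, 70, 90),
  mean_moop = c(9000, 9500, 9200, 10100, 9900)
)

fit <- lm(mean_moop ~ ideology, data = toy)

# p-value for the ideology slope, taken from the coefficient table.
p_value <- summary(fit)$coefficients["ideology", "Pr(>|t|)"]

# TRUE would indicate significance at the conventional 5% level.
p_value < 0.05
```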
I want to do the same thing for premiums. However, I need to load a new dataset.
I need to do some subsetting to stay with 2014 as well as get rid of premiums that are clearly outliers.
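A sketch of that subsetting. The column names and the cutoff `max_plausible` are hypothetical: the right threshold depends on what the outliers in the real file look like.

```r
# Toy premium data; the column names are assumptions.
rates <- data.frame(
  BusinessYear   = c(2014, 2014, 2014, 2015),
  StateCode      = c("AK", "AK", "AL", "AL"),
  IndividualRate = c(350, 9999, 280, 300)
)

# Hypothetical cutoff for a plausible monthly premium.
max_plausible <- 2000

# Keep 2014 rows whose premium is below the cutoff.
rates_2014 <- rates[rates$BusinessYear == 2014 &
                    rates$IndividualRate < max_plausible, ]

rates_2014
```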
Now I’m going to do something very similar to my previous analysis: I’ll use the aggregate command to find the mean premium for each state.
Looks good. Now I will merge this new dataframe with the previous one that I constructed.
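The merge can be sketched with base R’s `merge`, joining on a shared state column. The data frame and column names below are assumptions, and the values are toy numbers.

```r
# Toy state-level tables; names and values are for illustration only.
state_moop <- data.frame(
  state     = c("AK", "AL", "AZ"),
  mean_moop = c(11000, 8500, 9000)
)
state_premium <- data.frame(
  state        = c("AK", "AL"),
  mean_premium = c(400, 300)
)

# Inner join on state: only states present in both tables are kept.
combined <- merge(state_moop, state_premium, by = "state")

combined
```

By default `merge` drops states that appear in only one table. Passing `all.x = TRUE` would keep every row of the first table and fill the missing premiums with `NA`.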
Perfect. Let’s do another visualization.
Here the relationship is a negative one. More liberal states do have lower monthly health insurance premiums. However, this relationship isn’t statistically significant, either.