Make a Rose chart in R using ggplot
I recently got a request to make a rose plot, sometimes called a circumplex or doughnut chart. There are two cases for this kind of plot. The first is where your data naturally sits in a circumpolar coordinate system; circular or polar data fits naturally in such a chart. The second is where you take naturally cartesian data and transform it into the circumpolar coordinate system, often simply for visual effect. Either way, here I will describe how to do this in R (version 3.3.1, "Bug in Your Hair") and ggplot2 (it should work fine in any 2.x version).
Naturally circumpolar data
An example of a natural dataset for such a graph can be seen in this periodic data represented in the rose chart.
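As a rough sketch of what naturally periodic data might look like, the snippet below invents hourly counts over a 24-hour day and plots them straight onto polar coordinates. The `hourly` data frame, its values, and the name `p_hourly` are made up for illustration; they are not the data behind the chart mentioned above.

```r
library(ggplot2)

# Hypothetical periodic data: one count per hour of the day.
# The values are invented purely for illustration.
set.seed(1)
hourly <- data.frame(
  hour  = 0:23,
  count = round(50 + 40 * sin(2 * pi * (0:23) / 24) + runif(24, 0, 10))
)

# Hours wrap around naturally, so the bars can go directly
# onto a polar coordinate system.
p_hourly <- ggplot(hourly, aes(x = factor(hour), y = count)) +
  geom_bar(stat = "identity", width = 1, fill = "steelblue", colour = "white") +
  coord_polar() +
  labs(x = "Hour of day", y = "Count")

p_hourly
```

Because hour 23 sits next to hour 0 on the circle, the periodic shape of the data stays visible in a way a straight bar chart would cut in two.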
Naturally cartesian data
However, most people aren’t dealing with this natural coordinate system. Rather, they are in a traditional cartesian coordinate system. If you don’t know which one you have, you can assume with a high degree of probability that you’re in a basically cartesian space.
But we can still achieve the rose chart for this data. Let’s walk through it with some sample data.
    library(ggplot2)
    library(plyr)

    # generate some random data
    set.seed(42)
    events <- ceiling(10 * runif(10))
    sales <- 1000 * runif(10)

    # make a dataframe
    df <- data.frame(market = 1:10, events = events, sales = sales)
Now we have created some markets, each of which has a number of events (an integer from 1 to 10) and some sales returns (between 0 and 1000).
My dataframe ended up looking like:
We can easily create a bar chart that shows this data:
    # make the initial bar chart
    p1 <- ggplot(df) +
      aes(x = factor(market), y = sales, fill = factor(market)) +
      geom_bar(width = 1, stat = "identity")
p1 will give you your version of this plot:
You could easily make a similar chart for Events by Market.
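As a minimal sketch of that Events by Market chart, the block below repeats the data setup so it runs on its own and swaps `events` in for `sales`. The name `p1_events` is chosen here for illustration only.

```r
library(ggplot2)

# Recreate the sample data from above so this block is self-contained.
set.seed(42)
events <- ceiling(10 * runif(10))
sales  <- 1000 * runif(10)
df <- data.frame(market = 1:10, events = events, sales = sales)

# Same bar chart as p1, but with events on the y-axis.
p1_events <- ggplot(df) +
  aes(x = factor(market), y = events, fill = factor(market)) +
  geom_bar(width = 1, stat = "identity")

p1_events
```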
Translate to a circumpolar coordinate system
To make the data that we have into a rose plot we are going to wrap that bar chart onto itself.
    # now we simply want to cast the cartesian coord bar chart
    # onto a circumpolar coord system.
    p2 <- p1 +
      # 10 divisions across the 0 to 1000 sales range
      scale_y_continuous(breaks = seq(0, 1000, by = 100)) +
      coord_polar() +
      labs(x = "", y = "") +
      scale_fill_discrete(guide = guide_legend(title = "Market")) +
      theme(axis.text.x = element_blank(),
            axis.text.y = element_blank(),
            axis.ticks = element_blank())
Here we have taken the already existing bar chart, p1, and given it a continuous y scale that corresponds to 10 divisions; 1000 would not add any clarity to the resulting plot. We are simply trying to give a sense of scale in the y-axis.
We then push onto the polar coordinate system with coord_polar(). That’s it. The remaining calls help to clean up our presentation: we remove the x and y axis labels and add a legend title for the Market factor color map. Finally, using calls to theme, we remove all of the axis text and ticks to simplify the presentation.
Here is what we end up with:
That’s fine, but we lose all sense of perspective in the actual market values and comparison between markets could perhaps be made simpler. Let’s try to add some perspective. Moving back to our original bar chart, let’s add some grids that give a better sense of scale along the y-axis.
    # To get a visible grid, we add a variable to the dataframe that we
    # can draw as a separate layer. We use ddply from plyr to split the
    # original data by the market column and add a new "border" column
    # that we can then stack in a second geom_bar.
    df2 <- ddply(df, .(market), transform, border = rep(1, events))
    p1 <- ggplot(df) +
      aes(x = factor(market)) +
      geom_bar(aes(y = events, fill = factor(market)),
               width = 1, stat = "identity") +
      # width is a fixed setting, so it sits outside aes()
      geom_bar(data = df2, aes(y = border), width = 1,
               position = "stack", stat = "identity",
               fill = NA, colour = "black")
Firstly, we computed a second dataframe using ddply out of plyr. This took every market row and added a border column that has a 1 for every event in that market. Have a View() of the dataframe and you will see many more rows than in df; I have 70 in mine. Each market now has a number of rows equal to how many events there were in that market.
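If you would rather not depend on plyr for this step, the same expansion can be sketched in base R by repeating each row index once per event. This is an alternative to the ddply call above, not the method used in this post, and `df2_base` is a name introduced here for illustration.

```r
# Recreate the sample data so this block runs on its own.
set.seed(42)
events <- ceiling(10 * runif(10))
sales  <- 1000 * runif(10)
df <- data.frame(market = 1:10, events = events, sales = sales)

# Repeat each row index once per event, then pull out those rows.
df2_base <- df[rep(seq_len(nrow(df)), times = df$events), ]
df2_base$border <- 1          # one unit-height cell per event
rownames(df2_base) <- NULL

nrow(df2_base)                # same as sum(df$events)
table(df2_base$market)        # rows per market match the event counts
```

Checking `table(df2_base$market)` against `df$events` is a quick way to confirm that each market ended up with one row per event.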
We then did the same sort of bar chart as before, but do note that we have flipped to events for the y-axis. I have switched variables from what we did before so you can try it out for sales on your own.
Crucially, we added a second bar chart to the plot object, which uses the df2 data. It builds that bar chart from the border column data and stacks the results with no fill and black outlines. Your resulting bar chart looks like:
Cast our gridded bar chart to polar coords
Getting a rose chart from this new bar chart follows the same steps as before. All the differences are wrapped up in the generation of p1, so we have kept our code fairly DRY.
Rerunning the generation of p2 with the new p1:
    p2 <- p1 +
      # events run from 1 to 10, so this gives one break per event
      scale_y_continuous(breaks = 0:10) +
      coord_polar() +
      labs(x = "", y = "") +
      scale_fill_discrete(guide = guide_legend(title = "Market")) +
      theme(axis.text.x = element_blank(),
            axis.text.y = element_blank(),
            axis.ticks = element_blank())
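Coming back to the sales exercise left open earlier, here is one possible sketch. Sales are not whole counts, so it assumes a grid cell of 100 sales units (the `step` value is an arbitrary choice), gives each market as many full cells as fit, and adds one shorter cell for the remainder. The names `df_sales`, `p1_sales` and `p2_sales` are hypothetical.

```r
library(ggplot2)

# Recreate the sample data so this block runs on its own.
set.seed(42)
events <- ceiling(10 * runif(10))
sales  <- 1000 * runif(10)
df <- data.frame(market = 1:10, events = events, sales = sales)

step <- 100  # assumed height of one grid cell, in sales units

# For each market: full cells of height `step`, plus one remainder cell,
# so the stacked cells add up to that market's sales.
df_sales <- do.call(rbind, lapply(seq_len(nrow(df)), function(i) {
  full <- floor(df$sales[i] / step)
  rest <- df$sales[i] - full * step
  data.frame(market = df$market[i], border = c(rep(step, full), rest))
}))

# Filled bars for sales, with the outlined cells stacked on top.
p1_sales <- ggplot(df) +
  aes(x = factor(market)) +
  geom_bar(aes(y = sales, fill = factor(market)),
           width = 1, stat = "identity") +
  geom_bar(data = df_sales, aes(y = border), width = 1,
           position = "stack", stat = "identity",
           fill = NA, colour = "black")

# Cast onto polar coordinates, as with the events version.
p2_sales <- p1_sales +
  scale_y_continuous(breaks = seq(0, 1000, by = step)) +
  coord_polar() +
  labs(x = "", y = "", fill = "Market") +
  theme(axis.text = element_blank(), axis.ticks = element_blank())

p2_sales
```

A smaller `step` gives a finer grid at the cost of more outlines; 100 keeps the cell count close to the events version.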