Make a Rose chart in R using ggplot

I recently got a request to make a rose plot, sometimes called a circumplex or doughnut chart. There are two cases for this kind of
plot. The first is where you are using data that naturally sits in the circumpolar coordinate system. Circular or polar data would fit naturally
in such a chart. The second case is one where you want to take naturally cartesian data and transform it into the circumpolar
coordinate system. Often this is done simply for visual effect. Regardless, here I will describe how to do this in R (version 3.3.1, "Bug in Your Hair")
and ggplot2 (it should work fine in any 2.x version).

Naturally circumpolar data

An example of a naturally circumpolar dataset is the periodic data shown in this rose chart.

Polar Data Plot
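As a rough sketch of the naturally circumpolar case, here is one way such a plot could be built. The data is hypothetical (simulated counts per hour of the day, not the data behind the figure above), and the names hourly and p_polar are made up for this illustration.

```r
library(ggplot2)

# Hypothetical periodic data: simulated number of observations per hour of the day
set.seed(1)
hourly <- data.frame(hour = 0:23,
                     count = rpois(24, lambda = 20))

# One bar per hour, wrapped around a circle so that 23:00 sits next to 00:00
p_polar <- ggplot(hourly) +
    aes(x = factor(hour), y = count) +
    geom_bar(width = 1, stat = "identity",
             fill = "steelblue", colour = "white") +
    coord_polar() +
    labs(x = "Hour of day", y = "Count")
```

Calling p_polar draws the chart. Because the hours wrap around, the polar layout keeps neighbouring hours next to each other across midnight, which a straight bar chart cannot do.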

Naturally cartesian data

However, most people aren’t dealing with this natural coordinate system. Rather, they are in a traditional cartesian coordinate system. If you don’t know which one you are in, you should assume with a high degree of probability that you’re in a basically cartesian space.
But we can still achieve the rose chart for this data. Let’s walk through it with some sample data.

library(ggplot2)
library(plyr)

# generate some random data
set.seed(42)
events <- ceiling(10*runif(10)) 
sales <- 1000*runif(10)

# make a dataframe
df <- data.frame(market=1:10, events = events, sales = sales)

Now we have created some markets, each of which has a number of events (between 1 and 10) and some sales returns (between 0 and 1000).
My dataframe ended up looking like:

Market  Events  Sales
1       10      457.7418
2       10      719.1123
3       3       934.6722
4       9       255.4288
5       7       462.2928
6       6       940.0145
7       8       978.2264
8       2       117.4874
9       7       474.9971
10      8       560.3327

We can easily create a bar chart that shows this data:

# make the initial bar chart
p1 <- ggplot(df) +
    aes(x=factor(market), y=sales, fill=factor(market)) +
    geom_bar(width=1, stat="identity")

Calling p1 will give you your version of this plot:

Bar Chart

You could easily make a similar chart for Events by Market.
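For instance, a minimal sketch of the Events by Market version might look like the following. It rebuilds the sample data so that it runs on its own, and the name p1_events is made up here to avoid overwriting p1.

```r
library(ggplot2)

# Rebuild the sample data so this snippet is self-contained
set.seed(42)
events <- ceiling(10*runif(10))
sales <- 1000*runif(10)
df <- data.frame(market=1:10, events = events, sales = sales)

# Same bar chart as p1, with events on the y-axis instead of sales
p1_events <- ggplot(df) +
    aes(x=factor(market), y=events, fill=factor(market)) +
    geom_bar(width=1, stat="identity")
```

Calling p1_events draws the chart. The only change from p1 is the column mapped to y.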

Translate to a circumpolar coordinate system

To make the data that we have into a rose plot we are going to wrap that bar chart onto itself.

# now we simply want to cast the cartesian coord bar chart onto a circumpolar coord system.
# breaks every 100 split the 0-1000 sales range into 10 divisions
p2 <- p1 + scale_y_continuous(breaks = seq(0, 1000, by = 100)) +
    coord_polar() + 
    labs(x = "", y = "") +
    scale_fill_discrete(name = "Market") +
    theme(axis.text.x = element_blank(),
          axis.text.y = element_blank(),
          axis.ticks = element_blank())

Here we have taken the already existing bar chart, p1, and given it a continuous y scale that corresponds to 10 divisions; 1000 would not add any clarity to the resulting plot. We are simply trying to give a sense of scale in the y-axis.
We then push onto the polar coordinate system with coord_polar(). That’s it. The remaining calls help to clean up our presentation: we remove the x and y axis labels and give the legend for the Market factor color map a title. Finally, using a call to theme, we remove all of the axis text and ticks to simplify the presentation.
Here is what we end up with:

Rose chart

That’s fine, but we lose all sense of perspective in the actual market values and comparison between markets could perhaps be made simpler. Let’s try to add some perspective. Moving back to our original bar chart, let’s add some grids that give a better sense of scale along the y-axis.

# to achieve a grid that is visible, we will add 
# a variable to the dataframe that we can plot as a separate plot
# This means that we use plyr::ddply to subset the original data,
# grouped by the market column, and add a new "border" column
# that we can then stack in a separate geom_bar

df2 <- ddply(df, .(market), transform, border = rep(1, events))

p1 <- ggplot(df) +
    aes(x=factor(market)) +
    geom_bar(aes(y=events, fill=factor(market)),
             width=1, 
             stat="identity") +
    geom_bar(data=df2,
             aes(y = border),
             width = 1,
             position = "stack", 
             stat = "identity", 
             fill = NA, 
             colour = "black")

Firstly, we computed a second dataframe using ddply out of plyr. This took every market row and added a border column that has a 1 for every event in that market. Have a View() of the dataframe and you will see many more rows than df; I have 70 in mine. Each market now has a number of rows equal to how many events there were in that market.
We then did the same sort of bar chart as before, but do note that we have flipped to events for the y-axis. I have reversed what we did before so you can try it out for sales on your own.
Crucially, we added a second bar chart to the plot object, which uses the df2 data. It builds that bar chart with the border column data and stacks the results with no fill and black outlines. Your resulting bar chart looks like:

Bar Chart with Grids
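If you want to convince yourself of what the ddply step did, a quick check along these lines may help. It is a sketch that rebuilds the sample data so it runs on its own. The base R variant at the end, df2_base, is an alternative added for comparison and is not part of the walkthrough above.

```r
library(plyr)

# Rebuild the sample data so this snippet is self-contained
set.seed(42)
events <- ceiling(10*runif(10))
sales <- 1000*runif(10)
df <- data.frame(market=1:10, events = events, sales = sales)

# One row per event: each market's single row is repeated `events` times
df2 <- ddply(df, .(market), transform, border = rep(1, events))

nrow(df2)          # total number of rows, 70 with the seed above
sum(df$events)     # the same number, counted from the original data
table(df2$market)  # rows per market, equal to that market's event count

# Alternative without plyr (an assumption for comparison only):
# repeat each row index `events` times, then add the border column
df2_base <- df[rep(seq_len(nrow(df)), df$events), ]
df2_base$border <- 1
```

Both versions give one row per event, so stacking the border values in a bar chart draws one outlined cell per event.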

Cast our gridded bar chart to polar coords.

To get a rose chart from this new bar chart is no different to what we did before. All the differences are wrapped up in the generation of p1, so we have kept our code fairly DRY.
Rerunning the generation of p2 with the new p1, where breaks of 0:10 line up with the event counts:

p2 <- p1 + scale_y_continuous(breaks = 0:10) +
    coord_polar() + 
    labs(x = "", y = "") +
    scale_fill_discrete(name = "Market") +
    theme(axis.text.x = element_blank(),
          axis.text.y = element_blank(),
          axis.ticks = element_blank())

yields:
Rose chart with grids
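If you want to try the gridded version for sales, one possible approach is sketched below. It is only one way to do it and rests on some assumptions: the cell size of 100 (block) is a choice made for this example, and df3, p1_sales and p2_sales are made-up names. Since sales are not whole numbers, each market gets as many full cells as fit plus one partial cell for the remainder.

```r
library(ggplot2)
library(plyr)

# Rebuild the sample data so this snippet is self-contained
set.seed(42)
events <- ceiling(10*runif(10))
sales <- 1000*runif(10)
df <- data.frame(market=1:10, events = events, sales = sales)

# Assumed cell size: one grid cell per 100 units of sales
block <- 100

# For each market: full cells of size `block`, then the remainder as a partial cell
df3 <- ddply(df, .(market), function(d) {
    full <- floor(d$sales / block)
    data.frame(market = d$market,
               border = c(rep(block, full), d$sales - full * block))
})

# Filled bars for sales, with the outlined cells stacked over them
p1_sales <- ggplot(df) +
    aes(x=factor(market)) +
    geom_bar(aes(y=sales, fill=factor(market)),
             width=1, stat="identity") +
    geom_bar(data=df3, aes(y = border),
             width = 1, position = "stack", stat = "identity",
             fill = NA, colour = "black")

# Wrap onto polar coordinates, as with the events version
p2_sales <- p1_sales +
    scale_y_continuous(breaks = seq(0, 1000, by = block)) +
    coord_polar() +
    labs(x = "", y = "", fill = "Market") +
    theme(axis.text.x = element_blank(),
          axis.text.y = element_blank(),
          axis.ticks = element_blank())
```

The cells in each market add up to that market's sales, so the outlines should end where the filled bar ends. A smaller block gives a finer grid at the cost of more rows in df3.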

Most pie charts are bad, so here is a good one.

So, there is a movement in the data science community to kill the pie chart because all pie charts are bad. And much of the criticism is valid. Pie charts can be replaced by better representations, usually starting with bar charts, which convey more accurate and easier-to-interpret comparisons between categories. It’s harder to get away with willfully distorting perspective in a bar chart than it is in a pie chart, so always ask to see the percentage values in someone else’s dense pie chart. Where a pie chart does come into play is a binary comparison. That can be powerful. With more than 3 categories, use a bar chart.

But with all of that said, I came across a very illuminating pie chart today that I wanted to preserve.

Pacman pie chart

There is also this actual pie.

Pie eaten pie chart