Replace your complex bar chart with a dot plot to make it better understood

Sep 3, 2021 | UX, Analytics

The alternative to bar charts that works well with complex data

A spoon filled with dots in the foreground against a pink background
Photo by Sharon McCutcheon on Unsplash

Bar charts are used for nearly every situation, which means sometimes you’re struck by monstrosities like this:

A categorical bar chart with 7 bars representing each category, separated into 8 different strategies. As a result, 56 different colored bars are represented from left to right, with a line chart visible above the bar charts to represent the average.
https://www.smashingmagazine.com/2017/03/understanding-stacked-bar-charts/

Working with data visualizations often means working with bar charts, which can get boring after a while. Work with multi-variable bar charts, and you’ll wish for something easier on the eyes.

Luckily, there’s an alternative to bar charts that is both less monotonous and easier on the eyes. Many people prefer it to bar charts, especially as the data becomes more complex: the Cleveland dot plot.

Understanding dot plots

The dot plot came about as a result of two different forces at work. The first was a concern over how accurately people read visual cues. In the early 1980s, William Cleveland and Robert McGill published a piece on the accuracy of different visual cues.

The accuracy of different visual cues is represented, with the ones to the left being more accurate and the ones to the right being less accurate. From most accurate to least, the order is: position, length, angle, direction, area, volume, saturation, and hue.
Representation of the accuracy of different visual cues

Cleveland noted that people are a little better at judging position than they are at judging length. The other influence at the time was Edward Tufte and his famous “data-ink ratio.”

The text above says Data-Ink Ratio = Data Ink/Total ink used in graphic. Two visualizations are below: the left one has a low data-ink ratio because it’s in 3D, has shadows, and has a number of other additions that would require ink. The right one is a flattened bar chart with a better data-ink ratio. An arrow points from the left chart to the right one.
https://towardsdatascience.com/the-power-of-visualization-in-data-science-1995d56e4208

In simple terms, the less ink used for things other than the data, the better. These two ideas, accuracy and less clutter/ink, contributed to what is known as the dot plot.
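To make the ratio concrete, here is a minimal Python sketch of the formula from the figure above (data ink divided by total ink). The function name `data_ink_ratio` and the ink amounts are hypothetical, chosen only to illustrate the arithmetic; they are not measurements of any real chart.

```python
def data_ink_ratio(data_ink: float, total_ink: float) -> float:
    """Share of a graphic's ink that encodes data (Tufte's data-ink ratio)."""
    if total_ink <= 0:
        raise ValueError("total_ink must be positive")
    if not 0 <= data_ink <= total_ink:
        raise ValueError("data_ink must be between 0 and total_ink")
    return data_ink / total_ink


# Hypothetical ink amounts in arbitrary units (for example, pixels drawn).
chart_3d = data_ink_ratio(data_ink=200, total_ink=1000)   # shadows, depth, heavy grid
chart_flat = data_ink_ratio(data_ink=200, total_ink=250)  # same data, decoration removed

print(f"3D chart: {chart_3d:.2f}, flat chart: {chart_flat:.2f}")
```

Both charts carry the same amount of data ink in this example; the flat one scores higher only because the non-data ink was removed.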

A dot plot representing sales by store location. It’s shaped like a horizontal bar chart, with Rio de Janeiro being the top value (dot farthest to the right) at $1200k and Jesoro being the bottom value (dot farthest to the left) at $375k. All other locations are between the two.
https://learning.oreilly.com/library/view/the-big-picture/9781260473537/

The dot plot replaces the bars of the bar chart with dots, and it has 3 main advantages:

  • Reducing clutter
  • Being able to have a non-zero base
  • Being able to visualize multiple points easily

All 3 of these advantages work together to provide a better alternative when the data you’re showing begins to get more complex.
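As a rough illustration of the idea, the sketch below prints a text-only dot plot in plain Python. It is not a real charting routine: the function `text_dot_plot`, the store names, and the sales figures are all made up for this example. Each row is a light guide line with a single dot, and the axis is allowed to start above zero.

```python
def text_dot_plot(values, axis_min, axis_max, width=40):
    """Render one row per category: a light guide line with a single dot on it."""
    if axis_max <= axis_min:
        raise ValueError("axis_max must be greater than axis_min")
    label_width = max(len(label) for label in values)
    rows = []
    # Sort so the largest value is on top, as in a ranked dot plot.
    for label, value in sorted(values.items(), key=lambda item: item[1], reverse=True):
        if not axis_min <= value <= axis_max:
            raise ValueError(f"{label} is outside the axis range")
        # Position along the axis, not bar length, carries the value.
        column = round((value - axis_min) / (axis_max - axis_min) * (width - 1))
        line = ["."] * width
        line[column] = "o"
        rows.append(f"{label.ljust(label_width)} |{''.join(line)}| {value}")
    return rows


# Hypothetical sales figures in thousands of dollars.
sales = {"Store A": 1200, "Store B": 900, "Store C": 650, "Store D": 375}
plot_rows = text_dot_plot(sales, axis_min=300, axis_max=1200)
print("\n".join(plot_rows))
```

Because only the dot's position matters, starting the axis at 300 spreads the rows out without misrepresenting any of them.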

Dot plots vs. bar charts

Bar length is a great way of encoding data, but it becomes less valuable when you have to represent more values or categories. For example, if I just showed you this bar chart, you could estimate the two categories and compare them.

Two bars with no values. The purple bar on top is around 5x longer than the grey bar on the bottom.

Even if you didn’t have the exact numbers, you could estimate that the purple bar is 5x longer than the grey one. This is the strength of using length as a visual cue. But when you begin to have many more categories, the advantages of using bar length start to fade away.

When we look at a bar chart with multiple categories, we’re often not paying attention to the individual bar lengths. We’re just paying attention to the endpoint which represents the total value.

A comparison between the top bar chart and the bottom dot plot. The bar chart uses bar lengths for each category (Life expectancy in different countries), while the dot plot has a dot at where the end of each bar length would be, representing the same value.
https://www.infragistics.com/community/blogs/b/tim_brock/posts/bar-charts-versus-dot-plots

As a result, the dot plot and bar chart offer similar value to readers when each category has a single value, leaving the choice up to personal preference. But what if we wanted to show two variables for a category? This is where the data-ink ratio starts to become an issue, and dot plots begin to shine.

Here are a few ways you might use bar charts with two variables.

3 different charts showing how you might represent two variables as bars. The left-most chart is a stacked bar chart with values going up to 15,000, the middle chart has two bars per category going up to 7,500 to represent the two variables, and the right-most is two separate charts, one for each variable.
https://uc-r.github.io/cleveland-dot-plots

We can begin to see that these alternatives often require either a lot of space or ink. However, look at what happens when we use a dot plot.

The bar lengths have been replaced by dots in the dot plot, with each dot representing the total value of a variable on a line. As a result, there’s a whole lot more space to work with.
https://uc-r.github.io/cleveland-dot-plots

We not only use a lot less ink but there’s also a whole lot more room to work with.

The dot plot becomes even clearer when you consider another one of its advantages: unlike a bar chart, it doesn’t have to start at a zero baseline. This can allow for even greater clarity, especially if the data is grouped around certain points.

A bar chart on top and a dot plot on the bottom representing each country’s life expectancy at 3 different points (1990, 2000, and 2012). The bar chart consists of 3 bars per country: red (1990), purple (2000), and blue (2012), and the values have been squished together because the axis starts at 0. The bottom dot plot has the axis starting at 55, with 3 dots (red, purple, and blue) representing the same values. There’s a lot more space to work with on the dot plot.
https://www.infragistics.com/community/blogs/b/tim_brock/posts/bar-charts-versus-dot-plots

The dot plot here is much more readable when the axis is shifted to start at 55 years instead of 0, allowing for easy comparison of variables. We can’t do the same thing with our bar charts without skewing reader perception because the length of the bar is how audiences compare values.

A bar chart starting at a non-zero baseline. The values are skewed immensely: even though Sysco made $29.25 billion, it looks like it made close to 0 at first glance.
Sysco made close to $29.25 billion, but it looks closer to 0 based on bar length
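The distortion is easy to check with a little arithmetic. The Python sketch below compares the bar-length ratio a reader would see with and without a zero baseline; the helper `apparent_ratio` and the two life expectancy values are hypothetical and only stand in for the kind of data shown above.

```python
def apparent_ratio(larger: float, smaller: float, baseline: float = 0.0) -> float:
    """Ratio of bar lengths a reader sees when bars start at `baseline`."""
    if smaller <= baseline:
        raise ValueError("both values must be above the baseline")
    return (larger - baseline) / (smaller - baseline)


# Hypothetical life expectancies in years.
older, younger = 80.0, 60.0

true_ratio = apparent_ratio(older, younger)                    # bars start at 0
truncated_ratio = apparent_ratio(older, younger, baseline=55)  # bars start at 55

print(f"Bars from 0:  {true_ratio:.2f}x")
print(f"Bars from 55: {truncated_ratio:.2f}x")
```

With bars starting at 55, the 80-year bar is drawn five times as long as the 60-year bar, even though the real values differ by about a third. A dot plot on the same axis has no bar lengths to compare, so readers fall back on position against the axis labels.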

You can even add a line connecting the two dots in each row. This makes the difference between the two points easy to compare.

A dot plot of several different cities, with two variables per row. A line connects the two dots in each row, and each row is annotated with the percentage difference between the two dots. The title, “Total revenue by City and Gender,” points to how certain cities have a gap in revenue between genders.
https://uc-r.github.io/cleveland-dot-plots
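The connecting line and the annotation both come from the same pair of values per row. The sketch below is one way to compute them in Python, under the assumption that the percentage is measured relative to the first value in each pair; the function `gaps_between_dots`, the city names, and the revenue figures are hypothetical and are not taken from the chart above.

```python
def gaps_between_dots(rows):
    """For each category, return the span of the connecting line and the
    percentage difference of the second value relative to the first."""
    result = {}
    for label, (first, second) in rows.items():
        if first == 0:
            raise ValueError(f"{label}: first value must be non-zero")
        result[label] = {
            "line": (min(first, second), max(first, second)),
            "pct_diff": (second - first) / first * 100,
        }
    return result


# Hypothetical revenue per city for two groups, in thousands of dollars.
revenue = {"City A": (400, 500), "City B": (800, 820), "City C": (600, 450)}
gaps = gaps_between_dots(revenue)

for city, gap in gaps.items():
    start, end = gap["line"]
    print(f"{city}: line from {start} to {end}, {gap['pct_diff']:+.1f}%")
```

In a charting library, the `line` pair would become the endpoints of the segment drawn between the two dots, and `pct_diff` would become the text label next to it.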

What’s the point of dot plots?

Dot plots are a viable alternative to bar charts that use less ink to show the same data.

Some, including Cleveland himself, argue that dot plots are superior to bar charts. They allow for more accurate interpretation by making labels easier to read, reducing clutter, and allowing more whitespace.

So why aren’t they used more? The main disadvantage of dot plots is that they aren’t as familiar as bar charts. Audiences unfamiliar with them may meet them with confusion or resistance, so you have to consider the specific situation and label and explain the dot plots well. But they offer a very distinct advantage when it comes to categorical data with multiple variables. The clarity you get when shifting away from bar charts is definitely a reason to consider using them.

So if you’ve ever been overwhelmed by color fatigue and data overload with more complex data, consider using dot plots. They might just save you a headache trying to process the data.

Kai Wong is a UX Specialist, Author, and Data Visualization advocate. His latest book, Data Persuasion, talks about learning Data Visualization from a Designer’s perspective and how UX can benefit Data Visualization.