Less Software, More Design

MAKING GOOD VIZ

Essential design strategies for improving your visualizations

Street art in Detroit, Michigan. Photo by the author.

I teach students who are new to working with data. The prevailing myth is that good data visualizations come from (so-called) good software. Excel has perpetuated this myth by enabling users to quickly convert raw data into 3D bar charts and pie charts with dozens of slices or, worse yet, 3D pie charts. Bad visualizations are a problem not with the software but with the user’s design choices.

My motivation for writing this article is to encourage new learners, especially my students, to achieve compelling data visualizations by thoughtfully and intentionally applying design principles. I created the following infographic using Google Sheets. As you know, the data visualization community does not regard Google Sheets as “professional” visualization software, but this graphic is effective for its intended purpose and audience.

Simple infographic created in Google Sheets. Graphic by the author.

Software is neither a necessary nor a sufficient condition for creating compelling visualizations. People created some of the most impactful visualizations before computers existed. Claiming that one software package is better than another is no different from arguing that pens are better than pencils or that hammers are better than screwdrivers. You will develop faster and go farther by focusing less on software and more on problem-solving and design.

This article reviews essential design strategies that you can quickly and immediately apply to your data visualizations. My review is not intended to be comprehensive. Instead, this article serves as a guidepost for learning essential technical and conceptual design principles.

The biggest mistake that new learners make when building data visualizations is designing for themselves. Color choices and typefaces are selected based on personal preferences. Annotations, if used at all, are littered with abbreviations and jargon. New learners arrange graphical elements in a way meaningful to themselves, not the audience.

Make all of your design decisions from the standpoint of your audience. If you don’t know your audience, you may not be ready to visualize your data. Of course, you can use visualizations to explore your data, but I’m focused on telling data stories. After orienting yourself to your audience, you can use some of the following questions to help guide your design thinking.

  • What does your audience already know about the topic?
  • What does your audience need to know to understand the visualization?
  • Does your audience have specific information needs? What are those needs?
  • Is a data visualization the best way to communicate the data? Visualizations are not inherently superior to tables. Be clear why you are choosing a visualization.
  • How will the visualization be viewed? (Electronic vs. hard copy?)
  • How will audience characteristics influence interpretations? Think about culture, language, education, data literacy skills, etc.

Relying on software defaults is a sure-fire way to create an awful visualization. The following figures show the default settings for different software environments. How many charts have you seen with these color palettes, typefaces, font sizes, and grid lines? The data visualization community should reject default settings just as it has come to loathe 3D bar charts and pie charts.

Iris data set visualized with three different software environments. Graphic by the author, unfortunately.

If you are new to building data visualizations, go ahead and take a few minutes to enjoy a graphic you built with intention (as opposed to accidental clicking), but only a few minutes. The next big task is customization. Learning the different elements that make up a graphic can make you more efficient in your work. For example, in Google Sheets, you can double-click any chart element to bring up its customization panel. These are the elements I adjusted to customize my marijuana graphic.

Image by the author.

Josef Müller-Brockmann (1914–1996) was an influential figure in graphic design, and his body of work continues to inspire and influence designers around the world. He popularized grid systems, which are an essential aid for organizing graphical elements in visual communication.

Image from WikiCommons.

Müller-Brockmann writes:

The grid system is an aid, not a guarantee. It permits a number of possible uses and each designer can look for a solution appropriate to his personal style. But one must learn how to use the grid; it is an art that requires practice.

I suggest not using software tools to construct a grid at first. Start simple. Learn the conceptual principles of the grid system and sketch layouts by hand. Sketching different forms lets me iterate on ideas quickly.

Image by the author.

Entire books are devoted to the topic of grid systems, so my suggestions here are just a starting point for your further development.

Think about how your audience will consume the information. Where do their eyes start, and how do they move through the visualization? In English and many other languages, people read from left to right and top to bottom, which makes the upper-left region of your chart the most valuable real estate. This place is where people enter the visualization. Be sure to apply the same thinking to all graphical elements. For example, consider the following bar charts from the Iris data set. Both charts display the same data.

Graphics created using the Iris data set. Image by the author.

In this example, I optimized the left chart by giving the bars a horizontal orientation, since this matches how people read: left to right and top to bottom. In the right chart, the user’s eyes have to travel over the entire graphic and then read from the bottom to the top. This task is relatively easy with three bars but is extremely difficult when many bars are displayed.

What people typically call a font is a typeface. Your typeface selection affects the readability and tone of visualizations. If you want your visualization to look like a middle-school art project, then by all means, use Comic Sans — and throw in some 3D bars to make the graphic look interesting. Actually, please don’t do this. Ever.

A graphic using the Iris data set that should not have been created by the author. But it was.

Spend time studying different typefaces and learning about the problems they were intended to solve. Make informed decisions to ensure your graphics are readable and the tone is appropriate for the audience and story. The following image is an example of an assignment submission from one of my first graphic design courses on typography. The assignment involved giving a brief history and use of a typeface. I selected Avenir, which I learned is exceptionally versatile for creating a crisp, clean look.

Image by the author.

Making color choices is one of the most challenging problems to solve when constructing a data visualization. Avoid selecting colors to make your visualization look interesting. What you think is attractive may conflict with your audience’s information needs and can complicate or distort the story you are trying to tell.

Avoid stereotypical thinking when encoding dimensional values with color. Using blue and pink to represent gender is an overly simplistic way of problem-solving, especially with our understanding of gender identities. Encoding race values can be incredibly challenging and often requires a different visual channel to avoid stereotypical approaches or conflicting graphical cues. And, of course, you have to think about accessibility issues and how to ensure the segment of your audience who is colorblind can still interpret the graphic.

Ishihara test for color blindness, displaying number 74. Image from Wikipedia.

Unfortunately, I cannot summarize this vast body of information in a single paragraph or a short article. So, my recommendation is to add this as a topic to study. Aseem Kashyap has an excellent article for getting started.
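As one small starting point, here is a rough sketch of a check you can run on a pair of colors. It computes the relative luminance of each color using the formula from the WCAG 2 accessibility guidelines and then the contrast ratio between them. The idea, which is only a heuristic and my own assumption rather than a complete test, is that two colors with very similar lightness depend on hue alone to be told apart, so they are risky for viewers with color vision deficiencies. The function names and the example colors are illustrative choices; the blue and orange come from the widely shared Okabe-Ito palette.

```python
def relative_luminance(hex_color):
    """Relative luminance of an sRGB color such as '#E69F00' (WCAG 2 formula)."""
    hex_color = hex_color.lstrip("#")
    # Split into red, green, blue and rescale each channel to the 0-1 range.
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB gamma encoding to get linear light values.
    linear = [
        c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    red, green, blue = linear
    return 0.2126 * red + 0.7152 * green + 0.0722 * blue


def contrast_ratio(color_a, color_b):
    """WCAG contrast ratio: 1 means identical lightness, 21 is black on white."""
    lighter, darker = sorted(
        (relative_luminance(color_a), relative_luminance(color_b)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


# Illustrative pairs only: a classic red/green pairing and a blue/orange pairing.
pairs = {
    "red vs. green": ("#FF0000", "#008000"),
    "blue vs. orange": ("#0072B2", "#E69F00"),
}
for name, (color_a, color_b) in pairs.items():
    print(f"{name}: contrast ratio {contrast_ratio(color_a, color_b):.2f}")
```

A ratio close to 1 means the two colors would look nearly identical in grayscale. This check does not replace a proper color vision simulator or feedback from your audience, but it is a quick way to catch pairs that rely on hue alone.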

An effective visualization should stand alone to the extent possible. The stand-alone concept means that end-users have all the information in plain sight to understand what you are displaying. They should not have to search the Internet for more information or guess what your abbreviations or jargon mean. Spend time thinking about titles or headings for your graphic to help drive the narrative of your story. Much of my work is academic research, so my titles tend to be quite descriptive. But, sometimes, for my consulting work, I will use the headers to drive home a critical take-away point.

You can use text to create a visual hierarchy, but that requires some amount of contrast. For example, titles should be distinguishable from body text. You can achieve contrast through font pairing. Esther Teo has a great Medium article on this topic.

Whether creating tables or graphical displays, do not ignore the importance of sorting or ordering values. Most graphical systems sort dimensions alphabetically by default. That order might be useful, but it may not be the order your user needs. Sorting by value, for example, makes the largest and smallest categories obvious at a glance.
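To make the difference concrete, here is a minimal sketch in Python that orders the same categories alphabetically and then by value, printing each as a simple text bar chart. The category names and counts are made-up placeholders, not data from any graphic in this article, and `text_bars` is a hypothetical helper written just for this illustration.

```python
# Hypothetical category counts, invented purely for illustration.
counts = {"Delta": 12, "Alpha": 7, "Charlie": 30, "Bravo": 18}

# What many tools do by default: order the categories by label.
alphabetical = sorted(counts.items(), key=lambda item: item[0])

# Often more useful for comparisons: largest value first.
by_value = sorted(counts.items(), key=lambda item: item[1], reverse=True)


def text_bars(rows):
    """Return one line per category: padded label, a bar of '#', and the value."""
    width = max(len(label) for label, _ in rows)
    return [f"{label:<{width}} {'#' * value} {value}" for label, value in rows]


print("Alphabetical order:")
print("\n".join(text_bars(alphabetical)))
print()
print("Sorted by value:")
print("\n".join(text_bars(by_value)))
```

In the second listing the bars form a staircase, so the ranking is visible without reading a single number. Whether value order is the right choice still depends on your audience; ordinal categories such as age groups usually need to keep their natural order.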

When you become proficient at creating highly customized graphics, you will encounter situations where your tools (i.e., software) do not perform how you want them to perform. For example, I have some rusty R skills but can still create a customized graphic using ggplot. But, sometimes, getting the annotations just right can be enormously time-consuming. When I encounter this problem, I make a PDF of the image and finish my customization in Adobe Illustrator.

If you want to do this kind of post-processing, save your graphics as a vector image (e.g., PDF or SVG) rather than a raster image (e.g., JPEG or PNG). With a vector file, you can directly access and customize all the graphical elements, including text. A raster image can only be edited the way you would edit a photo, pixel by pixel, which is cumbersome. Whenever you do any post-processing, be careful not to distort the scale of the data. As a side note, you cannot just print a JPEG or PNG to a PDF and expect to access the elements.
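To see why vector files stay editable, consider this small sketch that builds a bar chart as SVG text using only the Python standard library. In practice your plotting tool exports the file for you; the function `bars_to_svg`, its parameters, the fill color, and the data are all hypothetical and exist only to show what is inside a vector file.

```python
from xml.sax.saxutils import escape


def bars_to_svg(rows, bar_height=24, gap=8, scale=4, label_width=90):
    """Build an SVG string of horizontal bars, each with a label and its value."""
    height = len(rows) * (bar_height + gap) + gap
    width = label_width + max(value for _, value in rows) * scale + 50
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
    ]
    for i, (label, value) in enumerate(rows):
        y = gap + i * (bar_height + gap)
        text_y = y + (bar_height * 3) // 4  # rough vertical centering of the text
        # The category label is stored as real text, so it stays editable.
        parts.append(
            f'<text x="0" y="{text_y}" font-size="14">{escape(label)}</text>'
        )
        # The bar width is the data value times one fixed scale factor,
        # so every bar stays proportional to its value.
        parts.append(
            f'<rect x="{label_width}" y="{y}" width="{value * scale}" '
            f'height="{bar_height}" fill="#4477AA"/>'
        )
        # The value is printed next to the bar as a separate text element.
        value_x = label_width + value * scale + 6
        parts.append(
            f'<text x="{value_x}" y="{text_y}" font-size="14">{value}</text>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


# Placeholder data, invented for illustration.
rows = [("Charlie", 30), ("Bravo", 18), ("Alpha", 7)]
svg = bars_to_svg(rows)
print(svg)
```

If you save the printed text to a file ending in `.svg` and open it in a vector editor, each label, value, and bar appears as its own object that you can select, restyle, or move. That also shows where the risk lies: resizing a single bar by hand breaks the link between its width and the data, so limit post-processing to text, color, and layout.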

This advice follows the work of Edward Tufte, one of the most influential people in the field of data visualization. Examine every graphical element and ask yourself whether it has information value. Does it encode information? Does it organize information? Does it offer visual cues for moving through or consuming information? Using the Titanic data set, I created a simple bar chart in Google Sheets, illustrating a minimalist approach like Tufte bars.

Graphic produced with Google Sheets using the Titanic data set. Graphic by the author.

Of course, I could have easily added a sinking ship to the graphic, but a ship submerging in water doesn’t convey my data story. Many of the default graphical elements were unnecessary. For example, I provided the exact values on the bars so the reader doesn’t have to scan back and forth to make comparisons, which made the y-axis unnecessary. I couldn’t figure out how to eliminate the y-axis, so my hack was to camouflage it in white. I added color to reinforce gender comparisons visually, but there is no need for a legend since the titles give that information.

Avoid adding graphical elements to make the visualization look interesting. The data visualization community refers to unnecessary graphic elements as chart junk. Everything might be clear in your mind, but the visualization is not for you; it is for your audience. The story from your data should be the point of interest. If you add graphical elements for appeal, be sure that they do not distract your user.

I’m turning all my course notes into shareable articles. Most of my articles are tailored for new learners. Feel free to follow me if you want to receive updates on new publications.