Data Visualization

This guide to data visualization standards and practices is specifically meant for Amino’s data stories. While we do visualize a ton of information in our products and day-to-day work, data storytelling has unique constraints. Having said that, if you want to make your analytics dashboard or presentation look nicer, you’re more than welcome to borrow these concepts. Just keep in mind they don’t always translate well to different media!

Contents

  1. Guiding principles
  2. Colors
  3. Constructing the chart
  4. Examples

Guiding principles

  1. Balance clarity to the layperson with technical precision. Healthcare is scary. Data is scary. We want to make both less scary. Sometimes, the most technically precise explanation (delivered with the best of intentions) is more likely to be misinterpreted by a layperson than a simpler, less technically precise explanation. Amino’s audience is almost entirely laypeople.

  2. Be transparent, truthful, and trustworthy. Our charts must be clearly decipherable and contain elements that engender trust (e.g. axes that start at zero wherever possible). They should also not contain any analytical “black boxes” (e.g. internally derived metrics or proprietary language).

  3. Maximize visual accessibility, especially for mobile screens. More than 60% of Amino’s blog traffic comes from mobile. In practice, optimizing for mobile means thin margins, bold fonts, and lots of contrast. We also use a color scheme that is discernible to people with common types of colorblindness.

Colors

Color palettes for data visualization need to be bold, distinct, perceptually accurate, and distinguishable to people with common forms of colorblindness. Thankfully, our Amino brand colors check most of these boxes.

Factors/Categorical

Color factors

A few notes on factors: Two factors are commonly used to distinguish patient gender in our data stories. For the sake of consistency, blue designates male and orange designates female. Adding more than five factor colors to a chart is strongly discouraged.

Continuous/Scalar

Context is key for selecting the appropriate color ramp. Generally speaking, we start with purple. The diverging scale is used to make a distinction between low and high values, where blue is typically low and orange is typically high. In some cases, coloring via quantiles creates an easier-to-understand continuous scale.

Diverging  Blue     Orange   Red      Purple
#16b8e0    #dbdbdb  #dbdbdb  #dbdbdb  #dbdbdb
#61bfdf    #ced8dc  #e3d2c9  #e1cfce  #d3cad4
#87c6de    #c2d4dc  #e9cbb8  #e6c1c0  #cbbacd
#a6cddd    #b3d1dd  #eec2a6  #e9b4b3  #c3abc6
#c2d4dc    #a6cddd  #f2b995  #eca8a6  #ba9bbf
#dbdbdb    #95c9de  #f5b184  #ef9999  #b28cb8
#e9cbb8    #87c6de  #f8a972  #f08d8d  #aa7bb1
#f2b995    #73c3df  #faa161  #f17e81  #a16ca9
#f8a972    #61bfdf  #fb974f  #f16f75  #985ca2
#fb974f    #44bbe0  #fc8e3b  #f15f68  #8f4c9b
#fc8626    #16b8e0  #fc8626  #f04d5d  #853b94
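
As a rough illustration of coloring via quantiles, the Python sketch below bins values into equal-count groups and gives each group one step of the purple ramp from this guide. The function name, the choice of five bins, and the sample values are all hypothetical. Amino’s charts are built in R, so this shows the idea, not the production code.

```python
from bisect import bisect_right
from statistics import quantiles

# Purple sequential ramp from this guide, lightest (low) to darkest (high).
PURPLE_RAMP = [
    "#dbdbdb", "#d3cad4", "#cbbacd", "#c3abc6", "#ba9bbf", "#b28cb8",
    "#aa7bb1", "#a16ca9", "#985ca2", "#8f4c9b", "#853b94",
]


def quantile_colors(values, n_bins=5, ramp=PURPLE_RAMP):
    """Map each value to a ramp color based on its quantile bin.

    Hypothetical helper; n_bins must be at least 2.
    """
    # Cut points that split the data into n_bins equal-count groups.
    cuts = quantiles(values, n=n_bins, method="inclusive")
    # Pick n_bins colors spread evenly across the ramp, ends included.
    step = (len(ramp) - 1) / (n_bins - 1)
    palette = [ramp[round(i * step)] for i in range(n_bins)]
    # bisect_right counts how many cut points are <= the value,
    # which is the index of its bin (0 = lowest quantile).
    return [palette[bisect_right(cuts, v)] for v in values]


# Hypothetical, heavily skewed values (not real Amino data).
costs = [120, 135, 150, 160, 180, 210, 260, 400, 950, 4200]
colors = quantile_colors(costs)
```

With a linear scale, the single large outlier would push most of these values into the lightest colors. Quantile bins give every color the same number of observations instead.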

The best color scales align hue and luminosity with their respective numerical value. This requires some corrections for perceptual accuracy. For more background information, see Picking a Colour Scale for Scientific Graphics by Doug McNeall. Perceptually accurate, intermediary colors are generated via Chroma.js.
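
One way to sanity-check that lightness tracks value is to compute the CIE L* lightness of each step in a ramp and confirm that it only moves in one direction. The Python sketch below does this for the purple ramp. It is a verification aid with a hypothetical function name, not a reimplementation of Chroma.js, which generated the colors themselves.

```python
def hex_to_lightness(hex_color):
    """Return CIE L* (0 = black, 100 = white) for an sRGB hex color."""
    hex_color = hex_color.lstrip("#")
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB gamma curve to get linear-light values.
    linear = [
        c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    # Relative luminance Y using the standard sRGB (D65) weights.
    y = 0.2126 * linear[0] + 0.7152 * linear[1] + 0.0722 * linear[2]
    # Convert luminance to perceptual lightness L*.
    if y > 216 / 24389:
        return 116 * y ** (1 / 3) - 16
    return (24389 / 27) * y


# Purple sequential ramp from this guide, low values first.
PURPLE_RAMP = [
    "#dbdbdb", "#d3cad4", "#cbbacd", "#c3abc6", "#ba9bbf", "#b28cb8",
    "#aa7bb1", "#a16ca9", "#985ca2", "#8f4c9b", "#853b94",
]

lightness = [hex_to_lightness(color) for color in PURPLE_RAMP]
# True when every step is darker than the one before it.
is_monotonic = all(a > b for a, b in zip(lightness, lightness[1:]))
```

The same check can be run on the other sequential ramps. For the diverging ramp, lightness should instead rise toward the gray midpoint and then fall.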


Constructing the chart

To get started, you will need:

  1. R
  2. Sketch, or something similar
  3. A custom ggplot2 theme. Amino’s theme can be found in this private repo. Check out this guide to learn how to make your own custom themes.

Generate the chart area in R via ggplot2, export as a PNG via ggsave, and drop the resulting file into Sketch. Several Sketch templates are provided in the private repo above.

This process may seem disjointed, but it allows us to create charts in a consistent and easily reproducible manner—playing to the strengths of each platform. If an analysis needs to be updated or tweaked, the chart can easily be refreshed by simply re-running the code in R. Meanwhile, titles, labels, legends, footers, and other stylistic elements can be handled in Sketch.

Layout & aesthetic guidelines for Amino’s charts

Dimensions: Default chart dimensions are 1080 by 1080 pixels. These dimensions were chosen as a sort of compromise between the wildly inconsistent image sizes used across social media platforms. Depending on the content being presented, height can be altered to suit, but width must be fixed.

Header: With few exceptions, title and subtitle must not exceed 3 lines. Either the title is one line and the subtitle is two lines, or vice versa. This is to keep the language neat and concise. Title is black and 40pt, subtitle is 54% transparent black and 37pt.

Legends and callouts: Text is black 25pt, which is the minimum readable size on mobile.

Chart area: The chart background color is #EEEEEE to provide contrast against white backgrounds used on Amino’s blog, social media, and most other sites.

Footer: The footer contains data source information, tagline, logo, and copyright. Text is 54% transparent black and 16pt. Data source information plus tagline must not exceed 2 lines. These can be combined as a single paragraph to accommodate. Footer background is #DBDBDB.
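
The size and line limits above are easy to check mechanically before a chart ships. The Python sketch below is one hedged way to do it. The `ChartSpec` class, its field names, and the placeholder text are hypothetical, and it assumes line breaks in the header and footer text were placed by hand in Sketch.

```python
from dataclasses import dataclass

CHART_WIDTH_PX = 1080      # width is fixed
DEFAULT_HEIGHT_PX = 1080   # height may vary with content
MAX_HEADER_LINES = 3       # title and subtitle combined
MAX_FOOTER_LINES = 2       # data source and tagline combined


@dataclass
class ChartSpec:
    """Hypothetical container for the text and size of one chart."""
    title: str
    subtitle: str
    source_and_tagline: str
    width_px: int = CHART_WIDTH_PX
    height_px: int = DEFAULT_HEIGHT_PX


def count_lines(text):
    """Count non-empty lines, assuming manual line breaks."""
    return len([line for line in text.splitlines() if line.strip()])


def check_spec(spec):
    """Return a list of guideline violations (empty if none)."""
    problems = []
    if spec.width_px != CHART_WIDTH_PX:
        problems.append(
            f"width must be {CHART_WIDTH_PX}px, got {spec.width_px}px"
        )
    header_lines = count_lines(spec.title) + count_lines(spec.subtitle)
    if header_lines > MAX_HEADER_LINES:
        problems.append(
            f"header uses {header_lines} lines, limit is {MAX_HEADER_LINES}"
        )
    footer_lines = count_lines(spec.source_and_tagline)
    if footer_lines > MAX_FOOTER_LINES:
        problems.append(
            f"footer text uses {footer_lines} lines, limit is {MAX_FOOTER_LINES}"
        )
    return problems


# Placeholder text only: a one-line title with a two-line subtitle passes.
good = ChartSpec(
    title="Title on one line",
    subtitle="Subtitle line one\nSubtitle line two",
    source_and_tagline="Data source line\nTagline line",
)
# A wider chart with a four-line header fails two checks.
bad = ChartSpec(
    title="Title line one\nTitle line two",
    subtitle="Subtitle line one\nSubtitle line two",
    source_and_tagline="Data source line",
    width_px=1200,
)
good_problems = check_spec(good)
bad_problems = check_spec(bad)
```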

To create clear, understandable, and visually accessible charts, it’s best to think in layers. In order from bottom to top:

  1. Base layer. The background color #EEEEEE
  2. Axis grids and text. A darker gray #DBDBDB
  3. Axis labels and zero lines. Both black; axis text is 25pt bold, zero lines have weight of 3px.
  4. The data. Either colored, or light gray.
  5. Highlighted data. Emphasized with a stroke of 2px.
  6. Legends. Pure black text, 25pt font.
  7. Callouts. Pure black text, 25pt font. Also can include arrows and lines with a stroke weight of 2px.
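
To see how these layers relate in terms of contrast, the Python sketch below computes a WCAG-style contrast ratio between the layer colors named above. The function names are hypothetical, and the 7:1 figure in the comments is the WCAG AAA threshold for normal text, which this guide does not itself cite.

```python
def relative_luminance(hex_color):
    """Relative luminance of an sRGB hex color (0 = black, 1 = white)."""
    hex_color = hex_color.lstrip("#")
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB gamma curve to get linear-light values.
    linear = [
        c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    return 0.2126 * linear[0] + 0.7152 * linear[1] + 0.0722 * linear[2]


def contrast_ratio(color_a, color_b):
    """Contrast ratio between two colors, from 1 (none) to 21 (maximum)."""
    lum_a = relative_luminance(color_a)
    lum_b = relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


BACKGROUND = "#EEEEEE"  # base layer
GRID = "#DBDBDB"        # axis grids
TEXT = "#000000"        # labels, legends, callouts

# Text should clear 7:1 (WCAG AAA for normal text) against the background.
text_on_background = contrast_ratio(TEXT, BACKGROUND)
# The grid sits close to the background so it stays behind the data.
grid_on_background = contrast_ratio(GRID, BACKGROUND)
```

Black text on the background comes out far above 7:1, while the grid stays close to 1:1, which fits its role as a quiet layer beneath the data.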

Data viz example 1

This example contains nearly all of the concepts mentioned above.


Examples

Below is an evolving list of examples with commentary and links to their respective blog posts.

Categorical data

Which axis you put your categorical variables on is an aesthetic choice. In this case, a long list of states is easier to read vertically than horizontally. The categorical axis grid is offset to create a table-like effect.

Data viz example 2

Source: Announcing nationwide imaging cost estimates on Amino

Time-series data

Line charts are most commonly used to display time-series data. Generally speaking, lines should be used for rate functions, and bars should be used for discrete values.

Data viz example 6

Source: What data from 205 million private health insurance claims reveals about America’s opioid crisis

The top chart shows deaths per 100,000 residents as a line (rate function) while the bottom chart shows patients as bars (discrete values). This is also an example of a combination chart: one title and two subtitles, tied together with narrative callouts.

However, line and bar charts aren’t the only way to show how data changes over time. In the example below, a heat map was employed to show when women received a specific type of ultrasound over the course of 308,000 pregnancies. Bands in gray represent weeks where only 100-200 patients were observed as receiving an ultrasound, while the darkest purple bands represent 10,000+ patient observations.

Data viz example 3

Source: What to expect when you’re expecting pregnancy ultrasounds

Geographical data

The hexagonal tile map is used when trying to compare states by a scalar variable (like cost, population, or age) and the precise geographical location of each state is not important.

Data viz example 5

Source: Here’s how much women could pay for preventive care under the AHCA

Regular maps are used to display county-level data or similar levels of detail.

Data viz example 7

Source: What data from 205 million private health insurance claims reveals about America’s opioid crisis

Relationships

A chord diagram is a useful way to visualize relationships in a matrix of data. In this example, the referring doctor is connected to the rendering doctor by a line representing the volume of referrals made that year—the thicker the line, the more referrals. This particular visualization was generated in Gephi and imported into Sketch as an SVG.
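
Before a graph tool can draw the chords, the referral matrix has to become a weighted edge list. The Python sketch below shows one way to do that. The doctor labels and counts are hypothetical, and the `Source`, `Target`, and `Weight` column names are an assumption about what Gephi’s spreadsheet import expects, so check them against your Gephi version.

```python
import csv
import io

# Hypothetical referral counts (not real data):
# matrix[i][j] = referrals from doctors[i] to doctors[j].
doctors = ["Dr. A", "Dr. B", "Dr. C"]
matrix = [
    [0, 12, 3],
    [5, 0, 0],
    [1, 7, 0],
]


def matrix_to_edges(labels, matrix):
    """Turn a square matrix into (source, target, weight) rows, skipping zeros."""
    return [
        (labels[i], labels[j], weight)
        for i, row in enumerate(matrix)
        for j, weight in enumerate(row)
        if weight > 0
    ]


edges = matrix_to_edges(doctors, matrix)

# Write the edge list as CSV text; the weight drives line thickness.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Source", "Target", "Weight"])
writer.writerows(edges)
edge_csv = buffer.getvalue()
```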

Data viz example 4

Source: Data on 211 million referrals shows how doctors really work together