Capture Charts 📈

1 From insights to impact

The aim of this document is to serve as a valuable resource for everyone at SHARE Creative or Capture Intelligence, enabling them to understand the commonly used visualisations created for client projects. It is important to appreciate when and how to use these visualisations, as well as how to interpret them.

The BBC has provided an excellent summary of this topic in their own graphics cookbook:

A reference manual, rather than a tutorial, it might not tell you how to make your very first chart in R, but is a useful collection of little tips and tricks.

In line with this, this document will not teach you R or the technical aspects of creating these plots. Within this resource, code is not provided in the hope of avoiding boilerplate and to try and make this document as coding language/software-agnostic as possible. Instead, its primary goal is to help you grasp the purpose and appropriate usage of visualisations in our work. By doing so, it empowers everyone to confidently explain the key insights, strengths, and weaknesses conveyed by different types of plots. Once the building blocks of this are in place, it is much easier to apply this knowledge to different coding languages and software.

The document begins by introducing various types of visualisations, each serving a specific purpose based on the type of task you want to perform. This approach is adopted because there are often multiple ways to present the same data, and each visualisation has its own merits and drawbacks. Within this section, you will find examples of the business questions we commonly encounter, accompanied by appropriate visualisations and explanations for interpreting them.

Subsequently, there is a section dedicated to enhancing the overall aesthetics of plots. These tips not only contribute to visually pleasing outputs but also assist in effective storytelling. Moreover, they introduce fundamental principles of data visualisation that we should all be mindful of.

2 Specific Visualisation Purposes

2.1 Trend over time

One of the fundamental purposes of data visualisation is to gain insights into how a particular metric or variable has evolved over time. This can be achieved through the use of line charts, which provide a straightforward way of representing trends.

When analysing trends over time, we often focus on metrics such as volume, proportion, or percentage. By plotting these values on a line chart, we can easily observe their fluctuations and patterns.

2.1.1 Example

Figure 2.1: We can use line charts to see daily trends, for example

2.1.2 Interpretation

Interpreting a line chart is relatively simple, but it is crucial to pay attention to the axes in order to fully understand the information being presented. For instance, we need to determine whether the y-axis represents raw values or proportions, and whether the x-axis represents years, months, or another unit of time.

It’s important to note that when creating trend plots, we typically need to group the data into intervals that are more easily digestible. This is especially true when working with data from Sprinklr, as the “Created Time” field is recorded down to the nearest second, and it’s unlikely that multiple posts are made at the exact same second. The choice of time intervals can significantly impact the interpretation of the plot. Selecting an inappropriate interval may obscure trends or exaggerate their significance.

In the example above, we binned the data on a daily basis. However, if we examine the plot below, which represents the data at a weekly scale, we can see how the view of volume fluctuations becomes less detailed.

Figure 2.2: We can visualise a different date interval, visualising a weekly trend in this case
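
As a purely illustrative sketch of the binning step described above (in Python, though the same logic applies in any language or tool), the snippet below counts a handful of made-up timestamps per day and per week. The timestamps and the function names `bin_daily` and `bin_weekly` are assumptions for illustration and do not come from Sprinklr or any internal package.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical post timestamps, recorded to the nearest second.
created_times = [
    datetime(2024, 1, 1, 9, 15, 2),
    datetime(2024, 1, 1, 17, 40, 51),
    datetime(2024, 1, 3, 8, 5, 33),
    datetime(2024, 1, 7, 22, 10, 9),
    datetime(2024, 1, 8, 12, 0, 45),
]


def bin_daily(timestamps):
    """Count posts per calendar day."""
    return Counter(ts.date() for ts in timestamps)


def bin_weekly(timestamps):
    """Count posts per week, labelling each week by the date of its Monday."""
    return Counter(
        (ts - timedelta(days=ts.weekday())).date() for ts in timestamps
    )


daily = bin_daily(created_times)    # four distinct days
weekly = bin_weekly(created_times)  # two distinct weeks
```

The same five posts produce four daily points but only two weekly points, which is why the weekly view looks smoother and less detailed.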

It’s worth noting that we can also present the same data using a bar chart instead of a line chart. The interpretation remains the same, but bar charts may not be as effective at displaying trends. This is because the bars are discrete rather than continuous, and they emphasise the magnitude of individual intervals rather than the overall trend.

Therefore, bar charts can be thought of as more useful when the absolute value of a data point is relevant for understanding a trend, or when we want to supplement an existing line chart with additional information (for example, representing volume alongside the proportion already plotted with time).

Figure 2.3: Trends can also be visualised using bars, though smaller changes in the trend are more difficult to see

2.2 Comparing categories

It is often necessary to compare different groups or categories in relation to a specific metric or variable. For instance, we may want to examine the volume of posts for different topics. The crucial aspect here is that the variables we are measuring represent distinct groups and may not follow a logical order (e.g., Platform). In such cases, a bar chart is the most common choice.

2.2.1 Example

Figure 2.4: Bar chart showing the number of messages per category

2.2.1.1 Interpretation

Similar to the aforementioned trend charts, interpreting a classic bar chart is straightforward. However, there are key modifications we can make to enhance the clarity of the plot and facilitate storytelling.

One such modification involves reordering the categories on the x-axis. We may choose to arrange them alphabetically if that order has been used previously in the project. Alternatively, it can be useful to display the bars in descending order based on the y-axis value, in this case, “Number of posts.”

Figure 2.5: Reordering the categories on the x axis by the value we are displaying can help improve plot clarity and is a useful storytelling technique
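
The reordering itself is a one-line sort in most tools. The sketch below, with made-up topic names and counts, shows the idea in Python: sort the categories by the value being plotted before passing them to whichever charting library is in use.

```python
# Hypothetical number of posts per topic (illustrative values only).
post_counts = {"Topic A": 120, "Topic B": 340, "Topic C": 85, "Topic D": 210}

# Sort categories by their value, largest first.
ordered = sorted(post_counts.items(), key=lambda item: item[1], reverse=True)

labels = [name for name, _ in ordered]    # x-axis order for the bar chart
values = [count for _, count in ordered]  # bar heights in the same order
```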

A significant challenge with bar plots arises when there are numerous categories. If these categories result in bars of similar lengths, a phenomenon known as the Moiré effect can occur. This effect leads to confusion and difficulty in focusing, making it challenging to distinguish between individual bars and accurately interpret the data.

Figure 2.6: A visual example of the Moiré effect with many similar bars

We observe an example of the Moiré effect in the bar chart above. It manifests when we start perceiving “patterns” that are not actually present due to sensory overload. Furthermore, charts with more than approximately eight bars, even if they vary in size, can appear cluttered and overwhelming.

2.2.1.1.1 Lollipop Charts

In order to address this challenge, we can employ lollipop charts to present the data in a different way:

Figure 2.7: Lollipop charts utilise whitespace better, sacrificing value precision for an all round more visually appealing chart

The lollipop chart offers a clearer visual representation that may be (hopefully is) more appealing to you. These charts can also serve as excellent alternatives to bar charts in general, particularly if you find yourself using bar charts excessively in a particular deck. Lollipop charts make better use of white space, avoiding the clutter often associated with busy bar charts.

One important consideration when comparing lollipop charts to bar charts is the level of importance attached to knowing the exact value of each category. In bar charts, it is evident where the top of the bar is located, but with lollipop charts, determining whether the centre, top, or bottom of the circle represents the value in question can be somewhat imprecise. To address this, it is possible to include the raw values directly on the lollipop chart. Despite this, it remains essential to consider the intended audience and the key message we aim to convey, whether it pertains to general trends or specific values.

2.2.1.2 Pie Charts

While pie charts are commonly used and provide a sense of familiarity, they are rarely the optimal choice for representing our data. Nevertheless, we often receive requests to include pie charts when comparing categories.

Pie charts are most effective when we are comparing part-to-whole values, and not comparing category to category. Furthermore, they are also useful when values are approximately 25%, 50%, or 75%. These percentages are easier to interpret within a pie chart compared to a stacked bar chart.

Figure 2.8: Pie charts are better than stacked charts when the individual parts add up to a meaningful whole

Pie charts are not suitable when we want clients to be able to accurately compare the size of different segments, i.e. when we just want to compare category size to category size.

Figure 2.9: Bar charts are more appropriate when we want to accurately compare segment size

This limitation can be somewhat mitigated by directly adding values onto the plot, but we will delve into that aspect later in this document.

2.2.1.3 Doughnut Charts

The sibling of the pie chart, a doughnut chart displays the same information as a pie chart, though the centre is removed. This allows information to be reported within the centre of the chart itself. However, whereas pie charts use the proportion of each slice to represent the values of interest, doughnut charts use the length of each arc. The same precautions mentioned above for pie charts should be kept in mind when making and presenting a doughnut chart (in fact, if anything, the number of categories that can appropriately be displayed is lower: up to 5).

Figure 2.10: Comparison between a Pie Chart and a Doughnut Chart. Both charts present the same data

2.2.1.4 Treemap

An additional method of comparing part-to-whole relationships is through treemaps. These represent different categories as rectangles, where the sizes of the rectangles represent the proportions or volumes of different categories within a larger category. The total plot area (i.e. the rectangle made up of all categories together) represents the total volume of the category of interest (in the case below, total mentions across 5 different platforms). We can compare the sizes of the rectangles to understand the relative volumes or proportions of the different categories: larger rectangles represent categories with higher volumes or proportions, while smaller rectangles represent categories with lower volumes or proportions. Note the use of a sequential colour scale (see the colour section below) to help guide the eye to the underlying values.

Figure 2.11: A treemap showing part-to-whole relationship of social media mentions within the whole conversation using the size and colour of rectangles to encode values

2.3 Comparing sub-categories within categories

A stacked bar chart can be a valuable tool for gaining deeper insights beyond a simple bar chart. It allows us to understand the relative composition of each bar (i.e. category) by considering the levels of a second categorical variable nested within.

In the context of our marketing company, stacked bar charts can help us answer important business questions. For example, we can use them to visualise how sentiment varies between different topics or brands, or any other categorical variable of interest.

2.3.1 Example

Figure 2.12: Stacked bar charts show how larger categories are divided into smaller categories

2.3.2 Interpretation

Interpreting stacked bar charts requires careful consideration, especially when presenting them to clients who may not be familiar with this type of visualisation. It may not be immediately clear to clients whether each stack starts from the same baseline, such as the purple stack in our case (which starts at 0), or if it starts from the top of the stack below (as it does in our case).

Comparing stacks between different categories can also be challenging. While it’s evident which category contains the largest number of negative posts in the baseline stack (purple), it becomes more difficult to compare the neutral and positive stacks since they don’t share the same baseline (ask yourself: are there more neutral posts in category C or D? It’s not clear, is it?). Being mindful of this limitation is crucial. Finally, stacked bar charts can be particularly challenging to interpret when the total bar size varies significantly. In such cases, the individual stacks can become squashed, making it harder to discern the relative proportions accurately.

Figure 2.13: Stacked bar charts may be more interpretable when the y-axis is normalised to a percentage or a proportion

To address the potential challenges mentioned above, we can scale our data to show proportions or percentages of posts with each sentiment per category, rather than raw volumes. This approach makes it much easier to compare the sentiment make-up of categories with lower volume. However, it’s essential to note that scaling the data in this way may lead clients to overlook the fact that each bar represents vastly different volumes in reality.

To navigate this potential pitfall, it is crucial to provide clear explanations of what the chart represents, whether it shows volume or proportion, and to determine a specific narrative or story that the visualisation aims to convey (such as comparing sentiments between categories or exploring sentiment variations within a specific category).
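
The trade-off described above can be seen with a small worked example. In the Python sketch below, the category names and counts are invented for illustration: two categories have an identical sentiment make-up once scaled to percentages, even though one has ten times the volume of the other.

```python
# Hypothetical sentiment counts per category (illustrative values only).
sentiment_counts = {
    "Category A": {"negative": 50, "neutral": 30, "positive": 20},
    "Category B": {"negative": 5, "neutral": 3, "positive": 2},
}


def to_percentages(counts):
    """Scale raw counts so that the stacks in one bar sum to 100%."""
    total = sum(counts.values())
    return {label: 100 * n / total for label, n in counts.items()}


percentages = {
    category: to_percentages(counts)
    for category, counts in sentiment_counts.items()
}
totals = {
    category: sum(counts.values())
    for category, counts in sentiment_counts.items()
}
```

Here both bars would look identical on a percentage chart, while `totals` shows 100 posts against 10. Reporting the totals alongside the scaled chart is one way of keeping that context visible.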

2.4 Visualising the distribution of a continuous variable

Sometimes we will want to understand the distribution of a continuous variable in our data to gain insights. For instance, if we have calculated a sentiment score ranging from 0 (negative) to 1 (positive) for each post in our dataset, we might want to examine its distribution. Although these types of visualisations are typically used for exploratory data analysis (EDA) rather than client presentations, they are still an extremely powerful and useful string to have in your data exploration bow.

2.4.1 Example

Figure 2.14: Histograms can showcase the distribution of a continuous variable in our data

2.4.2 Interpretation

Histograms are graphical representations that consist of bars, with each bar representing a specific range of values for the variable we are visualising (e.g., the x-axis of Sentiment Score). When interpreting a histogram, it is essential to consider the shape, centre, and spread of the distribution. The shape can be symmetric (normal), skewed to the left or right, or exhibit bimodal characteristics (two peaks). The centre can be determined using the mean or median, while the spread can be represented by the standard deviation or range. Histograms offer several benefits, such as identifying outliers, understanding data range and variability, and comparing variable distributions across different groups. Similar to our “trends over time” plots, it is important to select an appropriate bin width or value range for each bar when creating a histogram, as this choice can influence the interpretation of the distribution.

To provide clarity, let’s examine histograms depicting a left-skewed, right-skewed, and bimodal distribution:

Figure 2.15: Histograms revealing different underlying distributions within data

From these examples, we can see that the left-skewed distribution has a long tail stretching to the left, with most values clustered towards the upper end of our scale (closer to 1). The right-skewed distribution, on the other hand, has a long tail stretching to the right, with most values clustered towards the lower end (closer to 0). Lastly, the bimodal distribution displays two distinct peaks, with most values concentrated around two different values (0.3 and 0.7 in this particular example).
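
To illustrate how the choice of bin width can change what a histogram shows, the following Python sketch bins a small set of invented sentiment scores in two ways. The scores and the helper name `histogram` are assumptions for illustration only.

```python
# Hypothetical sentiment scores between 0 and 1 (illustrative values only).
scores = [0.05, 0.12, 0.18, 0.31, 0.33, 0.36, 0.52, 0.68, 0.71, 0.74, 0.93]


def histogram(values, n_bins, low=0.0, high=1.0):
    """Count how many values fall into each of n_bins equal-width bins."""
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for value in values:
        # Values equal to `high` are placed in the last bin.
        index = min(int((value - low) / width), n_bins - 1)
        counts[index] += 1
    return counts


coarse = histogram(scores, n_bins=2)  # bins of width 0.5
fine = histogram(scores, n_bins=5)    # bins of width 0.2
```

With two wide bins the data look roughly even across the scale, whereas five narrower bins reveal several separate groups of scores. Neither view is wrong, but they would lead to different descriptions of the same data.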

2.5 Visualising most frequent terms

The objective of a bigram network is to provide a comprehensive overview of a specific conversation by identifying frequently occurring pairs of words.

N.B. The term bigram refers to a sequence of two adjacent terms. If we were to examine three adjacent terms, this would be referred to as a trigram, and so on. In other words, a bigram specifically refers to an n-gram where n = 2. The plot we present is known as a bigram network, which displays these pairs of words. Whilst it may seem like splitting hairs, it is important to be precise in our terminology.
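
For readers who find the definition easier to follow in code, here is a minimal Python sketch of extracting and counting n-grams. It uses naive whitespace splitting on invented example posts; the helper name `ngrams` is hypothetical, and this is not how ParseR is implemented.

```python
from collections import Counter


def ngrams(text, n):
    """Return every run of n adjacent terms in the text, in reading order."""
    tokens = text.lower().split()
    return list(zip(*(tokens[i:] for i in range(n))))


# Hypothetical posts (illustrative only).
posts = ["join us every day", "join us today", "every day counts"]

# n = 2 gives bigrams; count how often each pair appears across all posts.
bigram_counts = Counter(pair for post in posts for pair in ngrams(post, 2))

# n = 3 gives trigrams for a single post.
trigrams = ngrams("join us every day", 3)
```

The pairs with the highest counts are the ones that would appear in a bigram network, with the order of the two terms giving the direction of the arrow.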

2.5.1 Example

Figure 2.16: Bigram network showcasing frequently appearing bigrams within a dataset

2.5.2 Interpretation

Interpreting a bigram network is more complex compared to the simpler plots we have previously discussed. Each word is represented by both a label and a node (circle), with edges (arrows) between terms representing the direction a bigram should be read. The colour of the nodes represents the frequency of the individual terms, while the colour of the edges represents the frequency of the bigrams. Additionally, the size of the nodes corresponds to the frequency of the individual terms. This size distinction can serve as an initial guide to identify frequently occurring words. Finally, it is important to note that the physical placement of the nodes holds no significance; their arrangement is determined by an algorithm based on their connections. For instance, the bigram “join us” is no more similar to “every day” than it is to “annual Hispanic”.

For a more detailed overview of bigrams and a more business-specific interpretation, please see the ParseR vignette.

2.6 Differences in language between categories

Often we will be interested in comparing language use between groups/categories. These categories could be audiences, posts of differing sentiment, posts from different quarters, etc. If we can classify different posts into distinct groups, we can compare the language used in each.

2.6.1 Example

Figure 2.17: Weighted Log Odds plots show how word usage changes between categories within our data

2.6.2 Interpretation

Before getting bogged down by the statistical interpretation of weighted log odds (WLO), let’s think about this from a data visualisation point of view. WLO charts are essentially scatter plots that depict word frequency on the x-axis and the log odds ratio on the y-axis. Each point on the scatter plot represents a word, with labels applied to identify them. Things get a little trickier when we take a look at the x-axis and realise it is on a logarithmic scale, meaning the distance between 1 and 10 on this scale is the same as the distance between 10 and 100 (WLO x-axes also do not start at 0, as 0 cannot be shown on a logarithmic scale).
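
The point about logarithmic spacing can be checked with a couple of lines of arithmetic. The Python sketch below is only an illustration of how a base-10 logarithmic axis positions values; the function name `log_axis_position` is made up for this example.

```python
import math


def log_axis_position(value):
    """Position of a (positive) value along a base-10 logarithmic axis."""
    return math.log10(value)


# The gap between 1 and 10 matches the gap between 10 and 100.
gap_1_to_10 = log_axis_position(10) - log_axis_position(1)
gap_10_to_100 = log_axis_position(100) - log_axis_position(10)
```

Each step along the axis therefore represents a tenfold increase in frequency rather than a fixed number of additional mentions.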

For a more comprehensive understanding of WLO and its statistical interpretation, please refer to the ParseR vignette.

Despite the complexity involved, it’s crucial to emphasize that the magnitude of a WLO value indicates the strength of the association but is not directly interpretable as a probability or frequency. Instead, it represents the logarithmic difference between two probabilities (or odds) and should be treated as a relative measure of association.

Therefore, when reporting WLO to clients, one must refrain from using phrases such as “This term is X times as likely to appear in Category A than Category B and C”. Instead use phrases such as “This term has a stronger association with Category A than Category B and C”.

2.6.3 Network visualisation of language use across categories

There is another way of visualising differences in language across categories that has been developed internally.

We can plot a network where the largest nodes symbolise the categorical variables in our data, with terms radiating from these nodes based on whether they are relevant to that specific category. With these plots, terms that are only relevant to a single category will only be connected to a single category and appear towards the edge of the plot. In the example below, these are terms such as “today”, “festival”, and “culture”. Terms that are relevant to multiple categories, however, will appear towards the centre, such as “amazing”, “hispanic”, and “celebrate”. As there are no labelled axes in this plot, we cannot determine whether two terms that are both connected to the same node are more or less associated with that category than each other.

Note that whilst the general interpretation of this plot will be consistent throughout its use, how we decide which terms can appear in the plot will change the specific interpretation. For example, the plot below shows the most frequent terms across two categories; however, we could also show the terms with the highest log-odds ratio for different categories. This would effectively recreate a standard WLO plot but visualise it as a network. Because there are no numerical axes, in some cases this might actually be an easier way than a standard WLO plot to show that certain terms have a stronger association with certain categories than others.

Figure 2.18: A network showing overlapping and distinct terms across different categories

2.7 Visualising complex conversational landscapes

UMAP (Uniform Manifold Approximation and Projection) is a powerful data visualization technique that can help analysts gain insights into complex datasets, such as the conversational landscape of social media mentions.

We are often tasked with capturing a wide array of conversations on various topics. However, analysing and making sense of this vast amount of data can be challenging. UMAP provides a way to visualize and explore this conversational landscape, enabling us to identify clusters, patterns, and relationships within the data.

2.7.1 Example

Figure 2.19: UMAP with different clusters mapped to colour

2.7.2 Interpretation

Note

This is a very high-level overview of UMAP. As explained previously, the purpose of this document is not to provide a thorough understanding of statistics and data science, but rather to build solid data visualisation principles and understanding from an interpretation perspective.

When applying UMAP to social media mentions, the technique projects the data into a lower-dimensional space, typically two dimensions, while preserving the underlying structure of the conversations. Each data point represents a post, and their proximity to each other on the UMAP plot indicates similarity in the context or content of the conversations (i.e. semantic similarity).

The plot above is an example of this two-dimensional ‘landscape’ that is produced by UMAP. We can see that it is effectively a scatter plot, though the interpretation is much more complex than for usual scatter plots. UMAP reveals clusters or groups of mentions that are closer together on the plot. These clusters represent conversations that share common themes, topics, or sentiments, and are represented by colour in the plot above. We can examine these clusters to identify distinct subtopics or communities within the larger conversational landscape. By exploring the content of these clusters, we can gain a deeper understanding of the prevailing narratives or discussions taking place on social media. Another way of visualising these clusters is by producing a faceted plot, as below, where each of our clusters appears separately as a small multiple on a single plot. This enables us to rapidly compare clusters in different parts of the data and see how similar or different they are.

Figure 2.20: UMAP presented faceted by a grouping variable

UMAP visualisations also highlight outliers or mentions that are far apart from the main clusters. These outliers may represent unique or less common aspects of the conversation, such as alternative perspectives or rare events, or they might represent irrelevant posts which have snuck through our data cleaning steps.

The density of points in certain areas of the UMAP plot can indicate the volume of conversations or the prominence of specific themes. Areas with dense clusters may represent highly active or popular discussions, while sparser regions could indicate less popular or peripheral topics. To identify density on a UMAP, it is usually most appropriate to plot using a monochromatic colour palette, as otherwise our eye gets distracted by colour:

Figure 2.21: UMAP presented with single colour

Finally, the distance between clusters can provide insights into the relationships or connections between different themes or conversations. Closer clusters may indicate related or overlapping discussions, while distant clusters suggest more distinct or separate topics.

2.7.2.1 A word of warning for using UMAP visualisations in decks

Most of the time, ‘visualising the landscape’ in a UMAP is something that is interesting for us as researchers, but is too complex and unnecessary for client deliveries. Often the client is interested in:
1. The number of clusters
2. The size of the clusters

In this case, a table summarising these two values is more suitable and digestible.

If presenting a UMAP like Figure 2.19 (a UMAP coloured by cluster) or a faceted UMAP like Figure 2.20, we must be confident that we can explain each of the clusters and/or that they directly aid understanding. Colours should only be used if a thorough clustering approach has been taken, with the name of each cluster clearly labelled on the plot. This approach is often relevant for thorough topic modelling or, for example, when mapping colour to sentiment.

If we have not applied a clustering approach directly to the UMAP (i.e. the data is not labelled with a grouping variable), but we still want to present the landscape in this way, use the single colour approach (Figure 2.21). This is best when we just want to show the landscape at a very high level and point out regions of high density without confusing the client with the addition of colours.

2.8 Comparing one or more categorical variables with a metric of interest

Heatmaps are powerful visualisations that use colours to help us identify patterns in the value of a metric for one or two categorical variables. They offer great versatility and can be employed for various purposes, all aimed at addressing specific business questions by examining colour intensity in different areas of the heatmap.

For instance, heatmaps can be utilized to determine the topics in which certain brands are mentioned more frequently, or to identify specific time periods when discussions are more intense.

2.8.1 Example

Figure 2.22: Example heatmap showing branded conversation within different topics

2.8.2 Interpretation

The heatmap displayed above illustrates the distribution of brand mentions across different topics. Each row represents a topic, and if we were to sum the values in each cell, the total for each row would equal 100%, while this wouldn’t hold true for each column. Check for yourself below with the values added to each cell:

Figure 2.23: The sum of the rows = 100% in this case

The x-axis represents different brands, while the y-axis represents different topics. Each coloured cell represents the percentage of conversation within each topic that includes mentions of a specific brand.
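
The row-wise percentages behind a heatmap like this can be sketched as follows. The topics, brands, and mention counts in this Python example are invented for illustration and are not the values shown in the figures.

```python
# Hypothetical brand mentions per topic (illustrative values only).
mentions = {
    "Topic 1": {"Brand A": 60, "Brand B": 30, "Brand C": 10},
    "Topic 2": {"Brand A": 20, "Brand B": 20, "Brand C": 40},
}

# Convert each row (topic) to percentages of that topic's branded mentions.
row_percentages = {
    topic: {brand: 100 * n / sum(row.values()) for brand, n in row.items()}
    for topic, row in mentions.items()
}

# Every row sums to 100%, but the columns (brands) do not.
row_sums = {topic: sum(row.values()) for topic, row in row_percentages.items()}
column_sums = {
    brand: sum(row[brand] for row in row_percentages.values())
    for brand in ["Brand A", "Brand B", "Brand C"]
}
```

Normalising by row answers “within this topic, which brands dominate?”. Normalising by column instead would answer “for this brand, which topics does it appear in?”, so it is worth stating which one a heatmap uses.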

Heatmaps are particularly useful for providing a generalised view of the data rather than an overly precise representation. Consequently, during interpretation, it is more appropriate to observe general patterns rather than focusing on specific values. For example, we can observe that Brand A is highly popular (darker colours) and is frequently mentioned in conversations for all topics except Topic 4. Additionally, Brand E stands out prominently in Topic 4, dominating the branded conversation within that topic (darker colour). Conversely, Brands D and F appear with much lower percentages in all topics (indicated by lighter colours in the Brand D and Brand F columns).

To further illustrate the interpretation of heatmaps, let’s consider the following example. The heatmap shown here effectively functions as a calendar, with each row representing a different day and each column representing an hour of the day. We can interpret this plot as depicting the proportion of branded conversations occurring at different times throughout the week. Immediately, two clear patterns emerge. Firstly, for each day, the majority of users posting about this brand do so between 16:00 and 21:00 (indicated by darker colours when reading rows from left to right). Secondly, when comparing the days, we observe that Saturday and Sunday have more posts at most hours than weekdays (indicated by darker colours when reading columns up and down).

Figure 2.24: Example heatmap showing hourly social media brand mentions for each day of the week

2.9 Showcasing inter-relationships between categories

Sometimes we might want to visualise how different categories share information/data. The information displayed can be flow (i.e. directed from one category to another) or connection (i.e. shared between categories). A business question that might require such a visualisation is understanding how customers have moved between a brand or product and its competitors.

We can use a chord diagram to showcase this, where flow or connections between categories are represented by arcs between categories positioned in a circular layout.

2.9.1 Example

Figure 2.25: Example chord diagram showcasing the flow of information/data between categories

2.9.2 Interpretation

Each category is represented by a segment on the outer part of the circular layout. Then, arcs are drawn between categories to depict the information flow or shared connections between categories. The size of the arc is proportional to the size of the flow. The presence of an arrow at the end of an arc indicates the flow of data, with the arrowhead indicating the direction of the flow. In the given example, the arrows represent consumers transitioning from one Brand to another. We would interpret this as saying that over 160 customers have moved from Brand A (the total width of the arrows leaving the “Brand A” segment), with ~40 customers moving towards Brand A (the total width of the arrows pointing towards the “Brand A” segment). Another insight we might take from this chart is that almost half of Brand B’s new customers have come from Brand A (the purple arrow pointing towards the “Brand B” segment).

When working with chord diagrams, the following factors should be taken into account:

Minimize the number of arc crossings by arranging the categories around the circle strategically. This improves the legibility and clarity of the plot.
Avoid overcrowding the diagram by considering the relevance of smaller connections. If certain connections are of lesser significance, they can be excluded from the plot, accompanied by a note such as “Values less than N are excluded for visual clarity.”
Provide a comprehensive explanation of the plot to the client. While chord diagrams are visually appealing, they can be challenging to understand. Breaking down each segment separately can help guide the client through the data journey and ensure a thorough understanding.
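
The totals we read off a chord diagram, and the filtering of small connections suggested above, can be sketched in a few lines of Python. The flows below are invented for illustration (they are not the values in Figure 2.25), and the helper names are hypothetical.

```python
# Hypothetical customer flows as (from, to) -> number of customers.
flows = {
    ("Brand A", "Brand B"): 50,
    ("Brand A", "Brand C"): 30,
    ("Brand B", "Brand A"): 20,
    ("Brand B", "Brand C"): 3,
    ("Brand C", "Brand A"): 10,
    ("Brand C", "Brand B"): 60,
}


def total_outflow(flows, brand):
    """Total width of the arrows leaving a brand's segment."""
    return sum(n for (source, _), n in flows.items() if source == brand)


def total_inflow(flows, brand):
    """Total width of the arrows pointing towards a brand's segment."""
    return sum(n for (_, target), n in flows.items() if target == brand)


def drop_small_flows(flows, min_value):
    """Keep only flows of at least min_value, for visual clarity."""
    return {pair: n for pair, n in flows.items() if n >= min_value}


left_brand_a = total_outflow(flows, "Brand A")
joined_brand_a = total_inflow(flows, "Brand A")
visible_flows = drop_small_flows(flows, min_value=5)
```

If flows are dropped in this way, the plot should carry a note stating the threshold used, as in the example wording above.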

2.10 Presenting precise numerical values

Sometimes we need to provide a comprehensive view of the data and allow the reader to examine the exact values of our analyses and make detailed comparisons themselves. Therefore, if precision is crucial for your analysis and you want to provide specific values or facilitate precise calculations, a table may be more appropriate. Tables can also be suitable when we want to summarise a lot of information that would otherwise require multiple charts over multiple slides.

The use cases for tables are almost infinite, which makes sense as every chart we make is based on a table which we could, unwisely, present in lieu of a chart. But let’s start with a simple example:

2.10.1 Interpretation

There isn’t too much to explain in interpreting tables. Instead, we will focus on some good tips when designing tables.

Tables become more readable the fewer columns they have. Careful consideration of the most important information necessary to portray will improve legibility.
Try to reduce the width of columns to help legibility. This includes shortening the column headings, but also using abbreviations or an appropriate precision for values (e.g. report 3.14 rather than 3.141592).
Sort the table by something appropriate. Whilst it might appear that sorting alphabetically is suitable, it rarely brings the most important data to the top. Often we might want to sort by volume, sentiment, or another value that supports the killer take-home message. We can see what our table now looks like if sorted by Var_1:

We can apply charts to tables! Yes, yes, I know this section was supposed to be about tables not charts, but we can add simple charts known as sparklines to show clear trends within tables. These charts are obviously not very detailed as they need to sit in a small space, but they can provide a nice complement to the other values presented in the table.

Add colour to the table. We can apply our heatmap principles from the section above to colour our table like a heatmap. This can be done either on a column-by-column basis (with different colours per column) or across multiple columns if they show the same variable (e.g. sentiment percentage in January [Column 1], February [Column 2], March [Column 3]). Bear in mind that there will be no legend, so it should be clear what the colours represent: high numbers should be darker.
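The sorting and precision tips above translate directly into data preparation steps. The sketch below is a minimal, tool-agnostic Python illustration: only the column name Var_1 comes from the text, while the other column names, the values and the function name tidy_table are assumptions made up for the example.

```python
# Hypothetical table rows; only the column name Var_1 comes from the text,
# every other name and number here is invented for illustration.
rows = [
    {"Topic": "Delivery", "Var_1": 1204, "Sentiment": 0.3141592},
    {"Topic": "Pricing", "Var_1": 3310, "Sentiment": 0.6180339},
    {"Topic": "Support", "Var_1": 2057, "Sentiment": 0.4142135},
]


def tidy_table(rows, sort_by, decimals=2):
    """Sort rows from largest to smallest and round floats for display."""
    ordered = sorted(rows, key=lambda row: row[sort_by], reverse=True)
    return [
        {
            key: round(value, decimals) if isinstance(value, float) else value
            for key, value in row.items()
        }
        for row in ordered
    ]


for row in tidy_table(rows, sort_by="Var_1"):
    print(row)
```

Sorting in descending order puts the largest value first, which is usually where the take-home message lives; the same function can be pointed at a different column if the story is about sentiment rather than volume.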

3 Chart Aesthetic Tips

The following are good data viz practices that should be kept in mind whenever we make a chart. Clients may request charts that go against these practices and principles, but it is important to be aware of such principles in our quest for making beautiful looking Capture Intelligence plots.

3.1 Colours

The choice of colours is one of the most important aspects of any good visualisation, with incorrect usage turning a slick visualisation into a hot uninterpretable mess of sadness.

The Capture Intelligence colour palette is based on the viridis collection of palettes, a series of colour palettes designed to improve graphic readability for those with common forms of colour blindness and/or colour vision deficiency. Plus these colours are super pretty.

Despite this, we are often tasked with using colour palettes that match the client we are performing the work for. This section will not explain how to create your own specific colour palette, nor will it go into detailed colour theory; rather, it aims to empower you to make appropriate decisions on the best colours to choose for different visualisations.

Broadly, there are three different types of colour palettes one can use to display different types of data:

Qualitative
Diverging
Sequential

3.1.1 Qualitative colour palettes

These palettes are best used to represent values of distinct categories that do not have an intrinsic order. As such, they are appropriate for line charts, bar charts, pie charts and doughnut charts.

These colours are different hues (i.e. different colours), and are sometimes called unordered colour scales. In these scales, no colour is worth more or less than any other colour.

The default Microsoft colour palette we use is an example of a qualitative (or discrete) colour palette:

Figure 3.1: Example qualitative colours - taken from the Microsoft colour palette

3.1.2 Sequential colour palettes

These palettes use multiple shade variations, effectively going from a light shade to a dark shade. They are suitable for representing numbers that go from low to high. This means that a reader can see a value represented by a “light colour” and inherently understand that this represents a lower value than a “darker colour”, without even having to look at a legend.

Whilst you can use only one colour (e.g. light purple to dark purple), using multiple colours (light yellow to dark purple) increases the colour contrast and makes it easier to distinguish between values.

Figure 3.2: Example sequential colours. These are powerful as they vary both in hue and saturation

Note

Whilst the colour palette above matches the Capture Intelligence colour palette, and is a sequential palette, there are also sequential palettes that change the lightness/brightness of the colour as well as using multiple colours. This can be thought of as effectively changing the colour from an intense colour to something more muted. Palettes which change brightness along with colour can sometimes be more effective in showcasing the benefits of certain data visualisation principles applied to different plots. As such, we also introduce the sequential colour scheme below, for the purpose of exaggerating the pros and cons of different palettes.

Figure 3.3: Example sequential colours. These are powerful as they vary both in hue and saturation

3.1.3 Diverging colour palettes

These palettes are best used when we want to represent a scale around a central value (i.e. a meaningful middle value such as zero, an average, a threshold, a target etc). Whereas sequential colour palettes go from low to high, diverging palettes utilise a neutral colour in the middle of the scale, with two opposite colours with varying shades diverging from this central value. These palettes are often used to visualise negative and positive values or Likert scales. There are two big advantages to using diverging scales: they emphasise the extremes, and they let readers see more differences in the data.

An example of this could be valence scores that range from 1 (positive) to -1 (negative) with a central value of 0 (neutral).

Figure 3.4: Example divergent colours - used to represent a scale around a central value

As you can see, the difference between sequential and divergent palettes is very nuanced (especially with many data values), and deciding between the two should be a considered choice. If you want to emphasise the highest values, use a sequential scale; if you want to emphasise both the lowest and highest values, use a diverging scale.
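To make the distinction concrete, here is a minimal Python sketch of how a value might be mapped to a colour under each type of scale. Everything in it is an assumption for illustration: the function names are hypothetical, the hex codes are simply a light yellow, a dark purple, a red, a near-white and a blue chosen as examples, and the straight-line blending in RGB is a simplification (properly designed palettes such as viridis are built more carefully than this). It also assumes low < mid < high.

```python
def hex_to_rgb(colour):
    """Convert '#rrggbb' to a tuple of three integers."""
    colour = colour.lstrip("#")
    return tuple(int(colour[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(rgb):
    """Convert a tuple of three integers back to '#rrggbb'."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def blend(start, end, t):
    """Linearly mix two hex colours; t=0 gives start, t=1 gives end."""
    a, b = hex_to_rgb(start), hex_to_rgb(end)
    return rgb_to_hex(tuple(round(x + (y - x) * t) for x, y in zip(a, b)))


def sequential_colour(value, low, high, light="#fde725", dark="#440154"):
    """Low values get the light colour, high values the dark colour."""
    t = (value - low) / (high - low)
    return blend(light, dark, min(max(t, 0.0), 1.0))


def diverging_colour(value, low, mid, high,
                     negative="#b2182b", neutral="#f7f7f7", positive="#2166ac"):
    """Values near mid are neutral; colour intensifies towards either extreme."""
    if value >= mid:
        t = (value - mid) / (high - mid)
        return blend(neutral, positive, min(t, 1.0))
    t = (mid - value) / (mid - low)
    return blend(neutral, negative, min(t, 1.0))


# Percentages from 0 to 100 suit a sequential scale.
print([sequential_colour(v, 0, 100) for v in (0, 50, 100)])
# Valence scores from -1 to 1 with a neutral 0 suit a diverging scale.
print([diverging_colour(v, -1, 0, 1) for v in (-1, 0, 1)])
```

The key design difference is visible in the function signatures: a sequential scale only needs a low and a high, whereas a diverging scale also needs a meaningful midpoint. If you cannot name that midpoint, a diverging palette is probably the wrong choice.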

To show why these different palettes are important, let’s see them in some example plots:

Figure 3.5: When to use qualitative colour palette

Here we see how a simple bar chart can look vastly different when using different palettes. Qualitative colour palettes maximise the distinction between categories, making it easy to differentiate groups at a glance. Similarly, the use of a qualitative palette ensures the colours do not imply any inherent ordering or hierarchy. The sequential colour palette unintentionally conveys a perceived hierarchy or sequence that doesn’t exist, and subtle differences in colour shades can make it difficult for viewers to distinguish between different categories. Applying a divergent palette to unrelated categories can create a false sense of order or relationship between them. The stark contrast in colours may also draw attention away from the actual values being compared, leading to misinterpretation or confusion.

Despite this, sometimes using a sequential palette can be okay for such plots when we want to emphasise an underlying order. Remember when we said that we could rearrange a chart so the categorical values follow the order of the variable of interest? In this case, using a sequential colour palette actually helps to double-encode the value of “number of posts” by both position and colour.

Figure 3.6: We can use a sequential palette when we have already ordered categories based on the value of interest

Here we can see that the ordered bar chart with the sequential palette is easier to read than the colourful and overwhelming qualitative palette.

Below is another example, this time visualising a heatmap. We can see the value we are visualising with colour (percentage) has a clear order. Using our qualitative palette, which has no inherent order, produces a lego-like mess where it is not clear what each colour represents. The divergent palette is slightly better, but still creates the impression there is a significant distinction between 10% and 40%, when in fact they form a continuous range. The best approach here is the sequential palette, which provides a smooth progression of colours to represent the increasing percentages. This ensures a more accurate representation of the data and helps users perceive the gradual change in values without introducing unnecessary confusion or bias.

Figure 3.7: When to use sequential colour palette

Despite being “text” rather than “numbers”, something like a Likert scale has an inherent order to it, and hence a qualitative colour palette is unsuitable for this particular visualisation. Whilst this palette enables easy differentiation between groups at a glance, it fails to represent the underlying scale or intensity of the Likert scale. The sequential colour palette is also unsuitable. Its subtle differences in colour shades can make it challenging for viewers to distinguish between the different categories accurately. The sequential palette may inadvertently suggest a progression or intensity within the Likert scale, misleading the interpretation of the data. In contrast, the divergent colour palette can be considered suitable for the Likert scale stacked bar charts. By using a divergent palette, meaningful thresholds or midpoints within the scale can be highlighted effectively. It enables the representation of both positive and negative values, accentuating the contrast between categories.

Figure 3.8: When to use divergent colour palette

3.2 Adding values to plot directly

Sometimes we might want a really clean-looking plot, or full transparency about the exact values being visualised might be paramount.

In this case, also including the raw values directly onto a figure can be extremely useful. Here we take the same bar chart we introduced earlier in the document but include the specific values that each of our bars represents as a label. Because we include these values, we can also remove our y axis as well as any plot gridlines as they no longer help us discern more information from the plot.

Figure 3.9: We can directly add bar values to the chart

As mentioned in the pie chart section, adding labels is highly recommended for pie charts and doughnut charts:

Figure 3.10: It’s extremely difficult to estimate quantity from angles or segments, so adding values directly to pie and doughnut charts is recommended

3.3 Adding labels to the plot directly

Similarly, it is often good practice to add a category label directly to the plot to avoid needing a legend. This is because legends generally take too long to read (one’s eyes have to go back and forth between legend and plot), they don’t work well with many colours, and they decrease the accessibility of our plots.

For example, let’s see a plot where we have a legend:

Figure 3.11: Line chart showing daily trend of three categories with a legend

Notice how you have to keep looking between the plot and the legend to fully understand which line represents which category?

When we plot the category label directly on the figure, the reader no longer needs to zig zag their eyes back and forth between the lines and the legend.

Figure 3.12: Adding the label of each category directly to the plot reduces the need for the reader to keep moving concentration from the main elements of the chart (the lines representing values) and the legend

3.4 Gridlines and borders

We should aim to remove clutter so that our plots are clean and elegant. A popular mantra you may see in the world of data visualisation is to “maximise the data-ink ratio, within reason”. This proposes a minimalistic approach to data visualisation by removing most parts of a plot which do not display the data itself. This is certainly one extreme of data visualisation aesthetics, and we should look to find a balance between including too much and too little ‘non-data ink’ (anything on a chart that doesn’t display the actual data).

The advice surrounding gridlines is:

Gridlines that run perpendicular to the variable of interest are the most useful
We do not need gridlines that go from the x axis on a bar chart where the x axis is a category (the bars themselves guide our eye the same way a gridline would).

Figure 3.13: Remove x axis gridlines when displaying categorical data

When the data values are specifically displayed on the chart (as in Figure 3.9) then we do not need gridlines to help interpret these values.

Figure 3.14: Remove y axis gridlines too when the value of the bar is represented by a label

Gridlines should not be the same colour as the axis lines or font. A light grey keeps the gridlines useful without overpowering the plot.

Figure 3.15: The difference gridline colours make

The advice surrounding borders:

If the plot you are making is faceted (i.e. made up of lots of little plots, as in the case of WLO), then a border should be included around each plot to clearly show which data and information is contained in each individual plot.
If the plot is stand alone, but requires axes, then only the axes of interest (normally x and y axes) should be drawn, with no outer border.

Figure 3.16: Don’t use borders for stand alone plots

Plots that do not contain any axes (e.g. a bigram) should not have a border around them.

Despite this, consistency is the name of the game here. If for whatever reason a client asks for a border to be around a plot (or a border removed), all similar plots in the deck should follow the same aesthetic.

3.5 Highlighting important data

There are many reasons why we might want to draw attention to significant information within a chart. Highlighting specific data elements can help us emphasize key findings, convey essential messages or insights, and enable viewers to discern patterns, differences, or relationships more effectively. By highlighting particular data points, we facilitate easy comparison and contrast between different groups, categories, or time periods, while also providing additional context and information.

Figure 3.17: Plots with multiple categories can look busy, overwhelming and be difficult for the audience to know where to concentrate

Imagine we are visualising some metric across different brands, such as in the plots above. We know we have unordered discrete categories, so we use our qualitative colour palette. However, the resulting visual can appear cluttered, making it difficult to determine which elements demand our focus. For instance, are we more interested in Brand A or Brand D?

To address this issue, we actually want to make grey the most common colour of our chart! By assigning grey to less important elements, we create a contrast that makes our highlighted colours stand out even more. To align with our narrative, we can highlight the key element in each plot that corresponds to the most important brand. This approach significantly enhances clarity and enables viewers to quickly identify the relevant data points related to the key brand under consideration.

Figure 3.18: By highlighting a key element of the chart, it clearly enables the audience to know what is the important category in our data-driven narrative and the key take home message
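In practice, the "grey for everything except the key brand" approach is just a colour lookup built before plotting. The sketch below is a hedged Python illustration: the function name, the default hex codes (a dark purple for the highlight and a light grey for everything else) and the brand list are all assumptions, and the resulting mapping would be passed to whichever plotting tool you are using.

```python
def highlight_palette(categories, key, highlight="#440154", muted="#bdbdbd"):
    """Give the key category the highlight colour and every other category grey."""
    return {
        category: (highlight if category == key else muted)
        for category in categories
    }


# Hypothetical brands; Brand D is the one our narrative focuses on.
brands = ["Brand A", "Brand B", "Brand C", "Brand D"]
colours = highlight_palette(brands, key="Brand D")
print(colours)
```

Keeping the key brand as a single argument makes it easy to reuse the same chart with a different brand highlighted when the narrative moves on.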