Defying Chart Design Rules for Clearer Data Insights

Climate change is one of the most pressing issues facing humanity in the 21st century. As is the case with Covid-19, politics, and other prominent topics that occupy public consciousness, the fields of climate change and sustainable energy have their fair share of popular but inadequate graphics: ones that awkwardly show relationships between variables, that sacrifice precision for eye candy, or that simply fail to translate numbers into an effective, clear visual takeaway. The right balance between appeal (“look at this”) and importance (“so what?”) is an enduring chart design challenge. Effective charts seek not only to inform, but also to move, influence, and inspire change.

Sir David MacKay (the late physicist, computer scientist, scientific advisor to the UK government, and one of the great science communicators) created one of the most well-thought-out, insightful, and evidence-based climate-related graphics. The chart, which he originally created in 2009, addresses the heart of the climate issue whilst providing food for thought and revealing feasible paths of action. And yet it defies a number of “conventional” visualization guidelines. Digging into his graphic is instructive, and it provides us with logic aids that can be used when breaking visualization rules.

MacKay’s original graphic (below) can be found in his acclaimed book Sustainable Energy – Without the Hot Air, and in his TED Talk from 2012. He made a significant effort to share and distribute several versions of this graphic, along with suggested captions and data sources.

A highly complex graphic where the x-axis is population density (people per km^2) and the y-axis is energy consumption per person (kWh/d/p). Both axes are on log scales. The countries are shown as colored bubbles that are sized according to land area. Diagonal lines show power consumption per unit of area.
David MacKay’s Map of the World: Power consumption per person versus population density, in 2005. Point size is proportional to land area and colors are based on region: olive green is North America / Australia; turquoise is Europe; emerald green is North Africa; blue is Sub-Saharan Africa; red is Asia; black is South America; and pink is Central America. The diagonal lines are contours of power consumption per unit area. Source: Sir David MacKay

From a data viz theorist’s standpoint, there seem at first glance to be several questionable choices.

Information density. This graphic uses a colored bubble chart which captures four variables at once (three numerical and one categorical). In addition, we have green and light pink diagonal lines plus thinner green line segments (or “tails”) for some of the bubbles. Though this data-ink ratio may indulge Edward Tufte acolytes, it appears to have strayed into information overload territory.

Logarithmic axes. These are so often used to distort and mislead that their mere presence sows the seed of distrust. While they do have a place in scientific fields or with context-aware audiences, this graphic was created to communicate to the general public! If the information density argument was not sufficient to relegate this graphic to the irrelevant archives, the log axes have certainly destroyed any hope of interpretability. Right?

Choice of axis variables and units. The units used on both axes are metrics that have been normalized: the x-axis is population density in people per square kilometer, while the y-axis is energy consumption per person in kilowatt-hours per day per person. But both include the “person” unit. Does this keep the variables independent? If so, why have “person” in the first place? Further, could the pink contours, which show yet another unit (W/m²), represent a fifth dimension? Is this perhaps a function of the two dimensions from the axes? Is it an overlay of a completely different chart to allow comparison between the darker green diagonals and the rest of the chart?

Choice of chart. The other question that arises is what scale is used to define the size of the bubbles. Since there is no legend on the visual, the creator has the freedom to define it. Another well-documented issue is the limitation of human perception in gauging differences between areas, and the best practice of making the bubble area proportional to the measure does not circumvent this issue. The disdain that accompanies graphics that force area comparisons has ensured that the proper use of pre-attentive attributes is featured prominently in visualization commandments.

Takeaways. There doesn’t appear to be a clear trend or takeaway message. A good reason to use a bubble chart (or a scatter plot for that matter) is to reveal some sort of trend or correlation among the depicted variables. Here the data do not appear to be correlated in any way, nor are there any apparent clusters formed from the same color bubbles.

How does all of this come together? Whatever the story is, couldn’t it be broken down into multiple charts, each significantly easier to absorb? Finally, is there even a message? And do we care? Is the graphic even redeemable?

Intentional chart design choices

It turns out that not only is the graphic redeemable, but a great deal of thought has gone into it, and there are excellent explanations for each of these design choices, as will become apparent below.

Logarithmic axes. Let’s start with the choice of axes. MacKay put two highly relatable metrics on the two axes, namely population density and energy use per capita, two factors that obviously impact energy demand. When multiplied, these two metrics give the power consumed per unit area (W/m²), a measure that is also represented by the diagonal contour lines. The contours are straight diagonals only because MacKay used log-log axes: on a log scale, a constant product becomes a constant sum, so all points along each diagonal correspond to the same power per unit area.
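The arithmetic behind the diagonals can be sketched in a few lines of Python. This is a minimal illustration rather than MacKay's own code: the function names and the sample densities are assumptions made for the example. It converts the two axis quantities into W/m² and checks that points sharing the same product fall on a line of slope -1 in log-log space.

```python
import math

HOURS_PER_DAY = 24
M2_PER_KM2 = 1_000_000


def power_per_area_w_m2(people_per_km2, kwh_per_day_per_person):
    """Multiply the two axis values and convert the product to W/m^2."""
    watts_per_person = kwh_per_day_per_person * 1000 / HOURS_PER_DAY
    people_per_m2 = people_per_km2 / M2_PER_KM2
    return watts_per_person * people_per_m2


def contour_y(people_per_km2, target_w_m2):
    """Per-person energy use (kWh/d/p) that gives target_w_m2 at this density."""
    watts_per_person = target_w_m2 * M2_PER_KM2 / people_per_km2
    return watts_per_person * HOURS_PER_DAY / 1000


# Hypothetical densities spanning several orders of magnitude.
densities = [1, 10, 100, 1000]
ys = [contour_y(d, target_w_m2=1.0) for d in densities]

# Slope between consecutive contour points, measured in log-log space.
slopes = [
    (math.log10(ys[i + 1]) - math.log10(ys[i]))
    / (math.log10(densities[i + 1]) - math.log10(densities[i]))
    for i in range(len(densities) - 1)
]
print(slopes)
```

Every slope comes out at approximately -1, which is why a constant W/m² value traces a straight diagonal on the chart. On linear axes the same points would lie on a hyperbola.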

The primary quantitative differentiator in the visual is the relative position of the bubbles to the pink diagonals, and not relative to the axes. So, using position relative to this derived diagonal axis, we see that Brazil and Canada have about the same consumption in terms of W/m², and so do China and India.

Choice of axis variables and units. Now, why is W/m² a relevant metric in the first place? It isn’t obvious at the start, but this is the key insight that MacKay offers: land area is a primary factor when thinking about sustainable solutions. Energy crops, wind farms, and solar farms often occupy vast swathes of land, so it seems worthwhile to wonder just how much land they may require to produce a unit of power, i.e., what is their power density (W/m²)? This is what is depicted by the green diagonals.

A version of the graphic with just one selected green line showing energy crops in W/m². Source: Sir David MacKay

Takeaways. So, both the diagonal green lines as sources of energy (supply) and the bubbles as countries (demand) appear in units of power per unit area (power density), and this allows us to compare supply and demand. Brilliant!

The key takeaways are then apparent. At a macro level, if we want to rely predominantly on renewables, we need to be thinking of country-sized solutions. There are also takeaways at the individual country level: for the UK, even if the entire country were devoted to energy crops, it would not satisfy its energy demand. For Singapore, it may make sense to import renewable power from larger bubbles of the same color, i.e., neighbors or countries on the same continent that have a larger landmass. In one fell swoop, MacKay shifts the attention from the problem (how much are we emitting?) to solutions and next steps: issues of land use, what technology could work where, and the need for international collaboration.
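The supply-versus-demand comparison reduces to a single ratio, sketched below. The function name is hypothetical and the two input values are placeholders chosen for illustration, not figures read from MacKay's chart.

```python
def land_fraction_needed(demand_w_m2, supply_w_m2):
    """Share of a country's land a source would need to cover to meet demand.

    Both arguments are power per unit area in W/m^2. A result above 1.0
    means the source cannot meet demand even if it covers the whole country.
    """
    if supply_w_m2 <= 0:
        raise ValueError("supply_w_m2 must be positive")
    return demand_w_m2 / supply_w_m2


# Placeholder values for illustration only (not taken from the chart).
demand = 1.0  # a country's consumption per unit of land area, W/m^2
supply = 0.5  # what a renewable source yields per unit of land area, W/m^2

fraction = land_fraction_needed(demand, supply)
print(f"Land needed: {fraction:.0%} of the country")
```

Graphically, a ratio above 1 corresponds to a bubble sitting above the green diagonal for that source, and a ratio below 1 to a bubble sitting beneath it.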

One interesting feature worth noting is that the “tails” attached to some bubbles signify those countries’ development over the previous 15 years. The countries often move up and to the right as they become more developed, a trend that is also better portrayed with two separate axes.

A zoomed-in look at Algeria, Sudan, and Brazil, countries that have directional “tails” to show movement over 15 years. Source: Sir David MacKay

Information density and choice of chart. Another reason that log-log axes can be forgiven here is that the underlying data for the countries exhibit a high dynamic range, i.e., the data span multiple orders of magnitude. Also, we are not looking for a trendline here, which would be another red flag considering Mar’s Law: “Everything is linear if plotted on a log-log scale with a fat magic marker.” Transforming the coordinates still shows a fairly scattered plot, with countries more or less well spread out between four extremes (Canada, Singapore, Bangladesh, Sudan).

In MacKay’s graphic, the area of each bubble is literally proportional to the country’s land area! There is no information lost through the chosen encoding. The size indicates the size of potential solutions, or the land area potentially available to deploy sustainable energy generation solutions.
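Making bubble area, rather than radius, proportional to a value means scaling the radius by the square root of that value. The sketch below shows the idea; the function name, the reference radius, and the two land areas are assumptions for illustration and do not describe how MacKay produced his chart.

```python
import math


def bubble_radius(value, reference_value, reference_radius):
    """Radius that makes bubble area scale linearly with value."""
    return reference_radius * math.sqrt(value / reference_value)


# Hypothetical land areas in km^2: the second is four times the first.
small_area_km2 = 1_000
large_area_km2 = 4_000

r_small = bubble_radius(small_area_km2, small_area_km2, reference_radius=5.0)
r_large = bubble_radius(large_area_km2, small_area_km2, reference_radius=5.0)

# The drawn area grows fourfold while the radius only doubles.
area_ratio = (math.pi * r_large**2) / (math.pi * r_small**2)
print(r_small, r_large, area_ratio)
```

Scaling the radius directly by the value would instead make the larger country look sixteen times bigger. Some plotting libraries already take an area: the `s` argument of matplotlib's `scatter`, for instance, is specified in points squared, so a value-proportional `s` gives area-proportional markers.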

The bubble colors are also thoughtfully used: they serve a primary purpose of indicating geographical vicinity (specifically, continent) and a secondary purpose of hinting at economic status (poorer Asia in red and Sub-Saharan Africa in blue versus richer North America and Australia in olive green and Europe in turquoise).

Could this visual contend to be one of the greatest statistical graphics ever created? I certainly think so. This graphic easily rivals Minard’s portrayal of Napoleon’s march to Moscow in richness and information density and is, in contrast, a modern, more mathematically elegant portrayal.

It also tackles a key issue for humanity in the 21st century whilst seamlessly tying together global economics, demographics, natural resources, and both physical and technological constraints. The graphic provides a new (land area-based!) perspective on the reality of demands and limitations on supply, and it makes the audience ponder how one might actually approach the climate problem.

MacKay also seems to have been perfectly aware of the significance of his chart, which he called “David MacKay’s Map of the World.” MacKay’s presentation of the graphic in his TED talk, in particular, is one of those rare combinations of stunning scientific insight and a stunning communication performance. All while seeming to violate visualization gospel.

Satya Amaran

Satya Amaran is the R&D leader for Operations Research, a team within the Machine Learning, Optimization, and Statistics capability, part of Dow’s Core R&D organization. At Dow, he teaches an internal short course on effective data visualization in addition to his primary job responsibilities. Satya earned his bachelor’s degree in chemical engineering from the National Institute of Technology in Karnataka, India, followed by an M.S. and Ph.D. in chemical engineering from Carnegie Mellon University.