Dataviz Horror Stories: Jorge Camões

Read this article in Portuguese here.

Have you ever made an embarrassing mistake in a data visualization? Did you utterly fail to meet your client’s or boss’s vision? Did you proudly share a visualization only to be overwhelmed by critical feedback? Of course you have! We all have, whether we’re beginners or superstars. In this series, we’re encouraging dataviz practitioners to share their own Horror Stories as a way of normalizing “failure” as part of professional development. Brave enough to share your personal tale of woe? Email us at nightingale@datavisualizationsociety.org.

Today’s submission comes from Jorge Camões, who, on a sleepless night long ago, came up with this peregrine idea that you can make good charts in Excel, and even wrote about it in his book Data at Work.

I have a closet full of data visualization skeletons. Choose any random mistake, and I’ll find an example. From 3D effects to broken scales to more subtle mistakes, they are all there. But I can blame many of those on my (then) youth and happily move on. I make better mistakes now (I think). Let me tell you about a recent one.

The data

Like everybody else in this field, I like turning a data table into a visual experience. Who’s not excited by the smell of a new data table? Paraphrasing Forrest Gump…

My mom always said a data table was like a box of chocolates. You never know what you’re gonna get.

So when a client shared sensor data with me and told me about their goals, my mind was already in high rotation, imagining all the great stuff I would come up with.

Sensor data is a beautiful, continuous flow of raw and dirty data. There are comforting patterns, inconvenient values, and outliers that sound all the alarms. It’s a kid’s candy shop, really. When there are dozens or hundreds of sensors, you need to find a way to monitor them at a glance, which was one of the client’s goals.

The presentation

The solution was pretty obvious to me: a clean grid of small multiples, layered alarms, a sidebar with filters and sorting keys. I was happy with the draft and got even happier when I realized that the client was not familiar with the concept of “small multiples.” They truly loved them and immediately recognized the potential for monitoring.

But then…

“This is not what we asked for,” they said. “It’s nice to have and helps reduce overhead in some processes, but the key piece is missing.” I had no place to hide.

What they really wanted vs. what I heard

The client had to repeat what my selective perception had filtered out the first time. I can’t go into much detail, but let me show you:

They wanted to display a time series of sensor data to evaluate patterns and detect specific points in the curve with special significance. See the red dots marking the inflection points after each peak? That’s what they wanted to see.

Problem is, although a trained eye knows what to look for and can spot the right point in the curve, this is not scalable and can’t be done with hundreds or thousands of sensors. The client didn’t know how to turn their perception into an algorithm to identify those data points. That’s what they were expecting from me, more than pretty charts.

Happy ending

Unlike the clean example above, the real data was often messy, and some other variables made automatic detection of the inflection points a bit tricky. Still, in the end, I came up with a process that detected those points accurately most of the time, while flagging bad data, so everyone was pretty happy with it.
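The actual process is not shown in this article. Purely as an illustration, here is a minimal Python sketch of one possible approach, under some loud assumptions: the "inflection point after a peak" is approximated as the steepest falling step between a local maximum and the next local minimum, the readings are lightly smoothed first, and "bad data" means missing values or values outside a plausible range. All function names, the window size, and the range limits are hypothetical, and the sample readings are synthetic.

```python
import math

def flag_bad(readings, low, high):
    """Indices of readings that are missing, NaN, or outside [low, high]."""
    return [i for i, v in enumerate(readings)
            if v is None or math.isnan(v) or not low <= v <= high]

def fill_bad(readings, bad):
    """Replace flagged readings with the nearest earlier good value."""
    bad = set(bad)
    good = [v for i, v in enumerate(readings) if i not in bad]
    if not good:
        return []
    filled, last = [], good[0]  # leading bad values take the first good one
    for i, v in enumerate(readings):
        if i not in bad:
            last = v
        filled.append(last)
    return filled

def moving_average(values, window=3):
    """Centered moving average; edges use whatever samples are available."""
    half = window // 2
    return [sum(values[max(0, i - half): i + half + 1])
            / len(values[max(0, i - half): i + half + 1])
            for i in range(len(values))]

def find_peaks(values):
    """Indices of strict local maxima (flat plateaus are ignored)."""
    return [i for i in range(1, len(values) - 1)
            if values[i - 1] < values[i] and values[i] > values[i + 1]]

def inflection_after_peak(values, peak):
    """Start index of the steepest falling step between a peak and the next dip."""
    end = peak
    while end + 1 < len(values) and values[end + 1] < values[end]:
        end += 1
    return min(range(peak, end), key=lambda i: values[i + 1] - values[i])

def detect_inflections(readings, low, high, window=3):
    """Return (inflection indices, flagged bad indices) for one sensor."""
    bad = flag_bad(readings, low, high)
    smooth = moving_average(fill_bad(readings, bad), window)
    points = [inflection_after_peak(smooth, p) for p in find_peaks(smooth)]
    return points, bad

# Synthetic example: one bell-shaped bump with two bad readings injected.
readings = [math.exp(-((i - 10) ** 2) / 32) for i in range(30)]
readings[5] = 99.0    # implausible spike
readings[20] = None   # missing reading
points, bad = detect_inflections(readings, low=0.0, high=1.5)
print("inflection indices:", points)
print("flagged as bad:", bad)
```

Real sensor data would need more care than this: a wider smoothing window, a minimum peak height to ignore noise, and a rule for what to do when bad readings fall right next to a peak.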

Lessons learned

This happened just before the Covid-19 pandemic made us all more aware that data quality and subject-matter expertise matter a lot more than we are often willing to acknowledge, especially in our social media bubbles. I’m not a designer, just an ordinary Excel guy. I work a lot on the data side, so I’m fully aware of the interplay between the data and the visuals and how to come up with designs that satisfy clients’ needs. Or so I thought.

If you have some experience, visualizing data effectively is not that hard. You likely know all about Gestalt laws and preattentive processing and how to use Ben Shneiderman’s information-seeking mantra (“overview first, zoom and filter, then details on demand”) to help structure the whole thing. That is all fine and dandy, but if you start visualizing the solution in your mind while the client is still explaining the data to you, you risk missing the point. The client doesn’t love data visualization as much as you do. They just bought into the idea that visualizing their data can help to solve their specific problems. For them, it’s a means to an end. If you do a good job, all those subtle design choices you’re so proud of will be natural, obvious, and invisible to them.

Finally, the data you should visualize is often not there or needs profound transformations (in the example above, there are multiple intermediate steps to go from the raw data to the red dots). These calculations often make no sense to a subject-matter expert, so you need to ask what they think (again, the Covid-19 pandemic provides many examples of this, like statistical modeling without epidemiological input). The inverse is also true: subject-matter experts can benefit from your expertise with data, so communication and collaboration are essential.

All in all, this “mistake” turned into an excellent experience for me. I learned a lot about the data, got the opportunity to apply fairly sophisticated data transformations and modeling to solve a client’s problem, and converted a new client to the virtues of small multiples. Not bad.