This article is part V in a series on data exploration and the common struggles that we all face when trying to learn something new. A list of previous entries can be found at the end of the article. I’m exploring the tools data from the State of the Industry Survey to illustrate both how I approach a new project and the fact that no “expert” is immune to the challenges and setbacks of learning. In addition to working with a new dataset, I am also using this project to take my first steps toward learning R. Let’s see where this journey takes us!
The previous article left us wandering around in productive tangents, exploring all of the remote corners of a problem before settling in to get to work. This is a good, productive space, but if it goes on too long, it will hurt your focus, your motivation, and your project.
Someone asked me recently how I know when to stop. I had to laugh, because I’m not at all sure that I do. (In fact, I would say that there are several signals in my life that indicate that I don’t! Whether you consider this to be a good or a bad thing is a matter of taste.) Ultimately, setting the limits for your exploration is a combination of experience, trusting your instincts, and knowing how to manage your timelines and energy levels. The amount of exploration that I do is always a balance between my interest, motivation, the time available for my project, and the core destination that I’m trying to reach.
After several articles in the expand phase for this project, now it’s time to start moving into the focus phase instead. This phase is all about pruning back, cutting down, and simplifying to get a clear picture of where you want to go.
Knowing when to stop
Precisely when and how to stop is possibly the hardest question in any exploratory adventure. You never know what will turn out to be wasted effort, or what amazing discoveries lie just around the next bend in the road. Here are some heuristics that I use to decide when it’s time to pull back or narrow down:
- When it looks like a dead end. If you’re pretty sure that a tangent is leading nowhere, there is little reason to pursue it. Pick one that looks promising instead.
- When you’ve understood what you needed to see. At this stage of a project, I am focused on identifying the structure of the problem and the core elements of the data rather than finding a particular answer or solution. Clinging to a need for completeness will only bog you down in the early discovery period. This is about sketching, not finalizing a masterpiece. If you’ve gotten what you need, move on.
- When it starts to feel overwhelming. This is usually a sign that your energy budget is running low. You need to refocus, change modes of exploration to exercise a different “muscle,” or take a break.
- When you start to lose focus. If I’m losing track of why I got into this thing in the first place, then it’s probably time to pull out. This can be a hard one to call, because there’s a fine line between losing focus and stepping out of your rut to see things in a different way. My personality tends to lean more toward discipline, focus, and clear goals, so I often make a conscious choice to indulge here and encourage myself to stretch into a less familiar space (for a short time, with a clear stopping point). If I’m losing my connection to the core purpose of the problem, then it’s a good time to re-evaluate.
- When the threads start to dissipate, rather than converge. Judging this one is tricky, and it’s ultimately a matter of intuition and experience. I will follow a lot of tangents for a step or two, but if it’s clear that they’re heading off into the wilderness and not toward my core goals, I’ll step back and redefine.
Letting go
Sometimes, the hardest part of stopping is that you don’t really want to let go. If fear of stopping is your challenge, it can be helpful to remember a few things:
- You can always come back. You’ve taken good notes and left a trail, so there’s no reason that you can’t pick this up again later. Let go of the false urgency that demands that things must be done right now.
- If you’ve learned something, no effort is wasted. Sometimes we get so attached to the time we’ve put into a project that we feel like we can’t walk away. This is so common that there’s a name for it: the sunk cost fallacy. There are shelves full of books on decision making that talk about how fear of cutting losses leads people to make bad decisions. You don’t need to fall into that trap. Take what you’ve learned, accept what you’ve already paid, and choose not to spend your life throwing good time after bad.
- Thoroughness isn’t always commendable. Many of us have succeeded in life because we work hard, we don’t give up, and we always do a complete and thorough job. Those are all good traits to have, but if you invest too much of your identity in those metrics, sometimes they work against you. There are times when it pays to be thorough, and there are times when it’s downright silly. Not finishing is actually the smarter choice when it increases your chances of getting where you need to go.
Switching into focus
The key feature of the focus phase is that we want to narrow things down, not open them up. That can sometimes be painful, but it can also be freeing. For some people, the expand phase is the hard one, and getting to focus feels like a relief. I really enjoy the expand phase, but I also appreciate the clarity of focus. I think of it as an opportunity to put down all of the options I’m carrying, so that I can invest all of my time and attention into a single path. Instead of asking “what do I need to do?” the focus phase is all about asking “what do I need to do first?”
Here are a few of the steps I take to switch from the exploratory stage into focus mode.
- Stop, and re-evaluate. You’ve learned some things and uncovered some intriguing potential. The world is full of possibilities. Now it’s time to ask: what am I really trying to do? Be ruthless when assessing what is truly necessary to achieve your goals.
- Compile key information. I will often create a list of things to include, things to leave behind, and things to come back to some other time. This can also help you to identify whether there is more information that you need.
- Identify the steps needed. The focus phase can be a long haul, and it’s important to be able to see your progress along the way. What needs to happen for this to succeed? How will you know that you’re making progress? How will you know when you’ve arrived?
- Build a plan. Are there things that have to happen first? Which ones are the most interesting? Which are the most fun? Which tasks are likely to present a challenge? Don’t forget to budget your energy, too: that’s where you’ll find the stamina for the long haul. Make sure to plan consistent “energy snacks” along the way.
- Restrict your scope. Pick something small to start with, and do the work. Let small successes build on one another, rather than tackling the whole thing at once.
Focusing on the survey project
As a first step, I went back through all of my notes and files and did a quick review. I made a mental list of the dead ends, good ideas, and new things that I wanted to explore. Then I looked carefully at what I thought would be a good candidate to do first. Jumping straight into the complex analysis would be a mistake: I’d have no way to know if it was even close to right, and I’d just get frustrated and lost before I’d even begun. Instead, I focused on making a simple chart first, just trying to figure out the basic steps needed to get things going.
List of objectives
- Learning R: Build out a simple chart, understand the mechanics of basic data manipulations.
- Tools analysis: Regroup on all of the different analyses, and identify one that’s a good candidate to start.
- Project strategy: Learn enough to create a plan for pursuing the more complicated analyses.
First step: Build out a simple chart
- Load the dataset from .csv.
- Clean the data, removing NAs and other items that I don’t want to count in the final piece.
- Work on information from a single column first; the tools analysis requires across-column manipulation, but that’s a lot harder to figure out. Start with the basics first.
- Repeat an analysis that I’ve already done in Excel. Replicating a working example allows me to check the numbers and retrace my steps.
- Figure out how to aggregate data in R, and learn how to manipulate the data object to do what I want.
- Format the data for use with a charting library.
- Create the chart!
- Apply styles and understand how to tweak the display.
Notice that most of these items are very small, concrete steps, and it will be easy to tell if I’ve done them. But then there are other items (anything that starts with “figure out”) that are much less defined. Those items are outside of my experience and off the edges of my current map. I will need to push harder to understand what those tasks mean, and each one will probably require its own expand phase to learn what I need to know. I’ll want to keep those explorations tight and focused to avoid getting pulled off track, but it’s good to realize up front that I’m going to need them, because that helps me to set realistic expectations for how long it will take to do this task. It’s easy to look at a focus list and assume that it will all be easy, but it’s important to take stock of the unknowns and the risks, too. Setting reasonable goals at this stage is a big part of getting successfully to the end of the project without giving in to frustration or burning out.
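For a sense of scale, here is a rough sketch of how the steps in the list above might look in base R. It is an illustration only, not the code used for this project: the file, the `role` column, and every value in it are made-up stand-ins (the real survey columns are not described here), and base R's `barplot()` is assumed in place of whichever charting library ends up being used.

```r
# Made-up stand-in for the survey export. The file name, the column names,
# and all of the values below are hypothetical placeholders.
survey_path <- tempfile(fileext = ".csv")
writeLines(c(
  "respondent,role",
  "1,Designer",
  "2,Analyst",
  "3,",
  "4,Designer",
  "5,Developer",
  "6,Designer"
), survey_path)

# Load the dataset from .csv, treating empty cells as NA.
survey <- read.csv(survey_path, na.strings = c("", "NA"),
                   stringsAsFactors = FALSE)

# Clean the data: drop rows where the column of interest is missing.
clean <- survey[!is.na(survey$role), ]

# Aggregate a single column: count the responses in each category,
# largest first.
counts <- sort(table(clean$role), decreasing = TRUE)

# Print the counts so they can be checked against the spreadsheet version.
print(counts)

# Create the chart and apply a few basic styles. barplot() accepts the
# one-dimensional table of counts directly.
barplot(counts,
        main = "Responses by role (made-up data)",
        ylab = "Count",
        col = "steelblue",
        las = 1)
```

Printing the counts before charting them mirrors the idea of replicating an analysis that already exists in Excel: the numbers can be compared directly before any time goes into styling.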
Once your plan is identified, the next step is simply to get to work. I usually find the clarity of this stage quite exciting, and really enjoy the simplicity and directed activity that the focus phase creates, especially after doing so much wandering during the early stages of the exploration. Sometimes it’s hard to make the important cuts, but it helps to know that they’re necessary if you want to keep your project on track.
Previous articles in this series:
Embrace the Challenge to Beat Imposter Syndrome
Step 1 in the Data Exploration Journey: Getting to Know Your Data
Step 2 in the Data Exploration Journey: Going Deeper into the Analysis
Step 3 in the Data Exploration Journey: Productive Tangents
Erica Gunn is a data visualization designer at one of the largest clinical trial data companies in the world. She creates information ecosystems that help clients to understand their data better and to access it in more intuitive and useful ways. She received her MFA in information design from Northeastern University in 2017. In a previous life, Erica was a research scientist and college chemistry professor. You can connect with her on Twitter @EricaGunn.