I had the great pleasure of seeing Catherine D’Ignazio speak at EYEO Festival this past summer. I was totally inspired by her message, and while I had heard about her forthcoming book, Data Feminism (which will be released this coming March by MIT Press), I had not read it at the time. Luckily for everyone, Catherine and her co-author, Lauren Klein, have published the first draft of their book publicly, which we can all read online now.
As a data visualization professional with more than a decade in enterprise-scale product design, I can say that this book is nothing short of a revelation. Data Feminism confronts many tenets of how we collect, store, and manipulate data. The book challenges how our teams are organized and who creates the analysis; it gives us pause about who makes decisions as a result of this work.
Data Feminism is a surprisingly light book, one told with generosity and humor. While it discusses some difficult subjects, it takes a human look at the forces that created them, and the humanity it will take to resolve them. It’s an easy read and one that has already caused me to adjust certain behaviors and take action. I believe it will be a watershed work that will help alter the trajectory of data science as a whole. I’m so pleased to have spoken to Catherine earlier in the month to discuss all of this.
Jason Forrest: First of all, maybe you can quickly introduce yourself, and then because Data Feminism hasn’t been officially published yet, maybe you could introduce the book?
Catherine D’Ignazio: Hi, I’m Catherine D’Ignazio. I’m a new assistant professor of urban science and planning at Massachusetts Institute of Technology where I’m in the Department of Urban Studies and Planning.
In relation to where the book Data Feminism came from, my own background is in software development; I was actually a freelance software programmer for 15 years. That’s how I paid the bills while I was an independent artist and designer, things that were much less lucrative. In many projects, I wove in a kind of social justice focus, and I have a long history of being associated with counter-cartography and critical cartography work.
When I went back to school at MIT in 2012, at the Center for Civic Media at the MIT Media Lab, there was all this hype around “Big Data.” One of the things that was really striking to me was just how “modernist” the discourse around data was. There was this kind of assumption on the part of a lot of people that the data was a one-to-one representation of reality. They would look at all the cool correlations, taking these raw inputs and translating them with magic into other outputs. I was like, “Oh wow, these folks need to read some Donna Haraway. They need to talk to some of the critical cartographers, folks that have been challenging map-making.”
I was going to this forum organized by my friend Mushon Zer-Aviv that was about responsible data and visualization. And so Mushon was like, “Oh you should really write something that we could discuss at the Responsible Data Forum,” so I wrote a very short blog post called “What would feminist data visualization look like?”
I was thinking: “What would we mean by that? What are some of the things that we would have to shift in order to make data visualization feminist?” To my surprise, the post kind of went viral in my community.
One of the librarians at MIT put me in touch with Lauren Klein, who had just given a lecture at Northeastern, in Boston. She had written this whole blog post about what feminist data visualization looks like from a historical perspective, highlighting works that haven’t been canonized in official histories of data visualization. We met in person and really liked each other’s different approaches.
We wrote a short paper, “Feminist Data Visualization,” for the IEEE Vis conference, then realized we can’t think about just feminist data visualization, because the visualization part comes at the end. You can’t make feminist data visualization if all the stuff that came earlier in the process was accomplished in some oppressive or terrible way that was reinforcing existing power structures. So that’s kind of where the impetus for Data Feminism came from.
We took that initial work about the communication side of data and backed up to look at the issues of power and inequality that affect the whole data science process. How do we point some of those out and showcase work where people are already doing feminist data visualization or feminist data science, even though they might not consider themselves as such?
But often I think when people see or read that work they are like, “Okay, everything’s broken. What do I do?” So we’re trying to be a little bit more productive — especially for folks who are practitioners. We’re thinking: “What do we do about this? What do we do with this information and our technologies and algorithms and models and other things that are flawed? How do we move forward?”
JF: You describe Data Feminism as “a book about power and data science.” Can you unpack those terms and who you think data feminism is specifically for and why it’s particularly important right now?
CD: When we say power, what we really mean is oppression, like oppressive power structures. Another way to say that would be inequality and where that comes about. I should also explain what definition of feminism we are using in the book. Feminists don’t all agree with each other. There are many different histories of feminism, and that’s why I think it is important to be specific. And so, we are very much drawing from Black Feminism, which is a tradition that originated in the US with Kimberlé Crenshaw, Patricia Hill Collins, and the Combahee River Collective.
One of their main assertions, which has come out of the past 40 years or so, is that we can’t only think about gender inequality. We also have to think about all the other sorts of oppressive forces that are at work in the world. We have to think about racism, classism, and all the structural inequalities. (They might look a little bit different in different countries and different cultures, but they exist.) What it means is that some people are systematically privileged and other people are systematically disadvantaged and marginalized. So while almost everything we discuss has a gender lens to it, we try to talk about the intersection of gender and race, or gender and class, and so on.
We try to ask questions like: How do these structural inequalities permeate the data science process? How do they infect every stage of the pipeline? That goes from what kinds of questions even get asked, right on down to what kinds of colors you choose to represent different things in your chart. So that’s kind of the intentionally broad focus on power while still being focused on gender throughout, but like, power writ large.
JF: So considering how many “isms” and how serious a lot of the work is, I was really struck by how you’ve told the story of Data Feminism in a relaxed and fun way. It’s an easy read and engaging. I found myself laughing a few times. How did you come to develop this style of writing for this book?
CD: In some ways it came naturally to us, and at other times we’ve wrestled with it because the subject matter is very serious. We tried to balance some of the lightness with some of the seriousness. There are some places where we have to talk about things that are very violent, or very unfair, or just very terrible — and those are places where we do not inject that kind of tone.
But at the same time, we were striving for an approachable introduction for folks who are new to data or folks who are new to feminism. For a lot of people in technical communities, like programmers or academically trained designers, feminism is not something they were exposed to in their education. Introducing those ideas in a way that is approachable and friendly for newcomers was one of the goals, as was demonstrating their applicability to more technical concerns.
At least one of my goals as an educator is that I try to meet people where they are. In a lot of cases in my learning trajectory, especially along the lines of things like racial equity, people have given me grace and generosity to not know all the answers, and to point out to me where I’m doing things wrong. We’re all at different stages of learning in that sense, so we try to meet people where they are and offer them some useful concepts.
JF: One of the things I wondered is if you had an idea about whether data itself can be objectively good or bad.
CD: No, no. No, I really don’t think so. I mean, I don’t think it’s necessarily good or bad. I guess the context is everything — like who is using it, how they’re using it, for whose benefit, with whose values and principles in mind. I think that’s the way it matters the most deeply.
Hmm. I don’t quite mean that either, because actually, we have developed a very specific set of technologies and methods (I’m thinking, like, inferential statistics or something) as a kind of body of knowledge, which you could make an argument for being objectively terrible. If you look at the history of statistics, it’s tied in with eugenics and racism and all these things. So we inherit these histories.
For example, think about maps. Maps come about because of, basically, European nation-states trying to dominate and exploit the world and commit genocide. That’s our history of maps that we inherit. That doesn’t mean that those tools and technologies cannot be re-engineered for other kinds of purposes. While we inherit a very flawed history when we use these flawed tools, that doesn’t mean that we can’t take steps towards justice or more emancipatory uses of those same tools.
JF: Now that I know that Data Feminism actually started by exploring feminist data visualization, why do you think that the visual display of information is so important?
CD: Because the communications interface is the part that actually goes out into the world. Right? The visualization is like the ambassador for the knowledge generated in the data science project. It’s going to be seen by many more people. Probably those people will not have the same level of expertise that whoever did the study has, and so in that sense, I’ve always been really interested in visualization and maps as communication objects. What world does this particular visualization or map portray? I see the communication of data as being extremely important — the first stepping stone into the subject matter for that future audience.
To go back to the “who’s it for” question for Data Feminism, we really saw this as borrowing from bell hooks where she says “feminism is for everybody.” So we say “data feminism is also for everybody.” It’s for men, it’s for people of all genders — it’s not only for women. To be more specific, we think about it as a book for those newcomers to either data or feminism, who might be coming from the more technical side. So it might be computer scientists or data scientists or statisticians, or it might be for folks in women and gender studies or folks in the humanities, like designers, journalists, librarians — what I would call public information fields — which are just so crucial and important right now.
I’ve been disappointed because a lot of the recent conversations around data ethics in the establishment are happening in these new ‘centers for data ethics,’ each located in a computer science department at a university. I find that very depressing. Because if we’re going to solve these issues on the implications of data in society for democratic outcomes, then the computer scientists are not the ones who are going to solve it. They have some great methods, but we need a lot of different people at the table. Let’s include all the people that I just mentioned, the friends from the humanities and social sciences, who have this wealth of information that can contribute to these conversations.
We’re trying to bring together all these different folks. Or at least, make a book that is accessible to all those different folks, to show that some of these ideas originated in the humanities, because feminism largely comes out of the humanities and social sciences and their thinking around inequality. These ideas have relevance for technical disciplines.
JF: I’d love to talk about the data-ink ratio. You have a different conception of it than many data visualization practitioners, so could you explain your take on it?
CD: Sure. Like I said earlier when I referred to the dialogue around visualization as modernist, that’s how I see the data-ink ratio from Tufte: this idea that we should devote the least amount of ink possible to any kind of decoration and embellishment of the data visualization. Like, it should be about the data and nothing more than the data, and anything else is a distraction. It’s strictly modernist, like the minimalist movement to strip painting down to its basic elements. Everything else is banished.
I think it’s dangerous when we as designers draw these hard and fast lines. We’re not the first to criticize the data-ink ratio, as there have been folks from the art spaces who have looked at maximalist data visualization. Or, as Kelly Dobson says, data visceralization: work that’s for all of the senses of your body, not just your eyes. Then there are folks from the scientific and technical visualization community who have demonstrated this in empirical studies. Again, it’s context-dependent, but in many cases, it doesn’t make sense to follow the super minimalist approach if what you care about are the basic things, like people remembering the graphic or people recognizing what it’s about.
So why is that? There’s a great paper by Scott Bateman and others that shows a great example by Nigel Holmes with a monster. Edward Tufte criticizes this as a terrible distortion of things, but actually in a lot of ways it makes sense if you’re thinking with your designer or artist hat on. Things like novelty and attention-grabbing tactics are the emotional aspects of communication that Tufte would banish, but we actually see that these are really good for recognition, learning, and remembering. Plus it’s just more fun!
So that would be our pushback — not to say “don’t ever do a minimalist data visualization” — but that we need to be careful when we establish these hard and fast rules. Often what we’re doing is establishing unspoken hierarchies where we’re saying ‘only reason shall enter here and emotion, you shall be banished.’
JF: Your book is filled with ideas such as anti-oppressive design, co-liberation, and a general reevaluation of bias detection in data science. Do you have any recommendations on how individuals in larger organizations can begin to operationalize these kinds of concepts into work at scale?
CD: That’s such a great question. One of the things I firmly believe is that you can ‘do feminism’ from wherever you are and whatever level of power that you have. You can practice feminism from that standpoint, so we don’t have to wait until, like, racism has been dismantled before we begin.
I like the “think local” thing: to focus on the things in your particular purview that you can change, even if you don’t have direct control over them. If you’re in a corporation, you can escalate things up the chain, and you can organize people so that you can do things differently. You can do a scan of the environment and think, “What are the ways that sexism and racism enter into our work?” and “How would we re-imagine doing our work differently?” A lot of those changes are not easy (making a change in any way is never easy), but it’s within your power to make them.
Depending on where folks are, they might not have complete control, but they might be able to change the composition of work teams, they might be able to influence hiring practices, they might be able to get a seat at the table at higher-level discussions about how things should be working.
We try to point out the ways we can make changes which we might not think of as being related, such as the identities of the people making something. It’s not an accident that we’re making a bunch of racist and sexist AI products right now because it is primarily dominant groups with little gender and racial literacy who are doing the work. We don’t catch these things because nobody on the team is looking for them, because those people are not the people who have the power. So all the identities of the people who are doing the work matter.
All of these kinds of policies for work and well-being are helpful, and they at least make it as comfortable as possible for women and people of color and low-income people to be in the work environment. All of those things actually really matter. You might think that’s separate, that’s somebody else’s job. No, that’s all of our jobs.
I always say this one is hard, but pushing back against your own privilege is important. We describe in the book “the privilege hazard,” the idea that sometimes we cannot see the privilege that we have. Me, for example, I don’t experience racism, and so I am not as sensitive as I could be to the times when racism has permeated my environment. That fact also gives me fewer skills to speak out when it does.
It’s worth thinking about what the lines of difference are and how I can be a good ally or accomplice. Like, how do I be a good ally or accomplice to the women in my working environment? How do I be a good ally or accomplice to the people of color or disabled folks in my environment?
So thinking across those lines is tied up with this idea of co-liberation: that none of us are free if some of us are not.
Thank you so much to Catherine for taking the time to speak with us!
I wholeheartedly suggest you go take a read through the draft version of Data Feminism! It’s an amazing book:
Lastly, here is Catherine’s whole presentation “Feminist Data, Feminist Futures” at EYEO Festival, 2019: