Asterisk Nation: One Tribe’s Challenge to Find Data About its Population

The Yurok Tribe in far northern California needs to address a condition plaguing numerous rural communities in the United States: addiction and substance misuse. Across the U.S., government agencies are increasingly turning to data to help plot their next steps in combating addiction. In California, for example, even sparsely populated counties can analyze and visualize a range of data, from emergency department visits for overdoses to zip code-level data on opioid prescriptions, to inform the decisions they need to make and evaluate the impact of their interventions.

While California collects racial and ethnic data on a host of issues, from opioid overdose to COVID prevalence to academic outcomes for students, data for Native Americans are reported less frequently, or go unreported, because of small sample sizes and policies that hinder collection.

The problems with data collection facing the Yurok Tribe are not unique to California, nor are they specific to Native American populations. What the Yurok Tribe experiences exemplifies a broader issue that data analysts and visualization practitioners should confront. How can we analyze findings and visualize results when data for important communities are simply not reported?

This is the question we, a data storyteller and an epidemiologist, posed to ourselves as we set out last summer to work with the Yurok Tribe Wellness Coalition as part of a technical assistance program sponsored by the California Overdose Prevention Network, a program of the Center for Health Leadership and Practice.

The issues the Yurok Tribe faces helped us better appreciate what so many groups contend with and allowed us to puzzle through what can be done to help a community aiming to confront modern challenges by leveraging data. Beyond simply obtaining broad data about Native American status, the Yurok Tribe also needs more specific tribal affiliation identification, which represents a political designation of the tribe’s sovereignty. Alas, this information, which can help the tribe provide necessary services while preserving important traditions, is rarely available.

Although the Yurok Tribe is California’s largest, at about 6,300 enrolled members, it simply can’t access crucial indicators of how members are faring. As Lori Nesbitt, the opioid program manager for the Tribe’s Wellness Coalition, observes, they often don’t get any data until a member dies from an opioid overdose, when it’s obviously too late to provide supportive and life-saving services.

Those of us who work frequently with data understand what’s at issue here: epidemiologists, statisticians, and analysts reporting racial and ethnic information are trained to suppress data for populations with small numbers, or to aggregate several smaller groups together. Although these are accepted as good statistical practices, they often fail to capture the micro-level trends that challenge an array of communities in the U.S., including tribal populations, Native Hawaiians and Other Pacific Islanders, Middle Eastern and North African populations, and other ethnographic groups.
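For readers less familiar with suppression, here is a minimal sketch of how a small-cell rule turns a number into an asterisk. The group names, the counts, and the cutoff of 11 are all hypothetical, chosen only for illustration; actual thresholds and rules vary by agency and dataset.

```python
# Illustrative sketch only: the counts and threshold are made up,
# not drawn from any real dataset or agency policy.
SUPPRESSION_THRESHOLD = 11  # assumed cutoff; real rules differ by agency


def suppress_small_counts(counts, threshold=SUPPRESSION_THRESHOLD):
    """Return a copy of counts with any value below threshold replaced by '*'."""
    return {
        group: (n if n >= threshold else "*")
        for group, n in counts.items()
    }


# Hypothetical event counts by group for a single county and year.
hypothetical_counts = {"Group A": 240, "Group B": 57, "Group C": 4}

published = suppress_small_counts(hypothetical_counts)
for group, value in published.items():
    print(f"{group}: {value}")
```

In this sketch the smallest group is the only one that loses its number, which is exactly the pattern that leaves small populations without usable data.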

In short, the aggregate means we aren’t looking at the full story. As California Governor Gavin Newsom often observes, “We don’t live in the aggregate.” Disaggregating data for smaller populations (whether defined by race and ethnicity, tribal membership, or some other important feature) is a technique that analysts and data storytellers should include in their toolbox to advance health and equity, even if it bumps up against statistical practices. There are strategies for disaggregating data (combining multiple years of data, oversampling smaller populations) while maintaining statistical rigor.
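One of these strategies, combining multiple years of data, can be sketched in a few lines. The years, group names, counts, and the reporting cutoff of 11 are hypothetical and serve only to show the mechanics; a real analysis would also need to weigh what is lost by blurring year-to-year change.

```python
# Illustrative sketch only: all figures are invented for this example.
REPORTING_THRESHOLD = 11  # assumed minimum count an agency will publish


def pool_years(counts_by_year):
    """Sum each group's counts across all years in counts_by_year."""
    totals = {}
    for year_counts in counts_by_year.values():
        for group, n in year_counts.items():
            totals[group] = totals.get(group, 0) + n
    return totals


# Hypothetical annual counts; "Group C" falls below the cutoff every year.
hypothetical_by_year = {
    2017: {"Group A": 80, "Group C": 4},
    2018: {"Group A": 75, "Group C": 5},
    2019: {"Group A": 85, "Group C": 3},
}

pooled = pool_years(hypothetical_by_year)
reportable = {group: n >= REPORTING_THRESHOLD for group, n in pooled.items()}

print(pooled)
print(reportable)
```

Under these assumptions, a group that would be suppressed in every single year clears the cutoff once three years are pooled, so a data point can replace the asterisk.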

But if we can’t disaggregate, we’re left with incomplete information, and these blank cells in tables and empty spaces on graphs are often visualized by an asterisk, indicating suppressed data. The National Congress of American Indians’ Policy Research Center says it well:

“American Indians and Alaska Natives may be described as the ‘Asterisk Nation’ because an asterisk, instead of a data point, is often used in data displays when reporting racial and ethnic data due to various data collection and reporting issues, such as small sample size, large margins of errors, or other issues related to the validity and statistical significance of data on American Indians and Alaska Natives.”

However, beyond issues of data analysis, there are important historical and contemporary factors at play, namely the genocide and oppression of Native Americans, which pre-dates the founding of the United States of America. During our project, we learned from our Yurok Tribe partners how this history and its legacy play out today, even through our data systems. The U.S. Census, for example, did not count Native Americans until 70 years after the inaugural census in 1790 (to learn more, check out this timeline from the U.S. Census Bureau and this commentary from the Pew Research Center). And in a more modern example, CNN’s election coverage this past November reported results for Native Americans under a category it termed “something else,” which was offensive to people of all racial/ethnic backgrounds, particularly Native Americans and other communities of color.

The end result is a paucity of data, and, put simply, you can’t visualize an asterisk. If the data are not there, how are we to know and visually describe how these populations are faring?

Through our project, the Yurok Tribe Wellness Coalition sought to better understand what data were being collected on Native Americans (and specific tribes) by public agencies across Humboldt and Del Norte counties, where the tribe is located, so that data reporting can be improved to better support the Yurok people and prevent opioid overdoses. We partnered with the Coalition to conduct interviews with public agencies (social services, health, law enforcement, education) to learn more about their data systems and practices.

What did we learn?

  1. Tribe-specific data, and data on Native Americans in general, are not regularly collected by the eight public agencies that participated in our assessment.
  2. Data sharing policies are in place between non-tribal and tribal entities, but they are underutilized.
  3. Despite the challenges, public agencies are interested in partnering with tribes to improve data collection and reporting. All agencies think there would be benefits to the larger community through better data collection and sharing.

So what broader lessons can be drawn from this project? Simply being aware of who’s not measured is an important first step. The next step is to talk with the tribes and other populations who may be made “invisible” in data about how we can do better. It’s only in partnership that we can start to make data more representative of all groups.

The changing categories the U.S. Census Bureau has used to measure race. Credit: Pew Research Center

As the census count wraps up in the United States, we’ll soon analyze results and create illuminating visualizations summarizing the findings. As we do, however, it’s important to account for those who are simply not counted, or who are undercounted, by federal, state, and local agencies that have no data or don’t report the data they do have.

In the coming months and years, as census data are compiled, released, analyzed, and visualized, and as we fret over and visualize COVID-19 findings (including, now, the need to obtain racial/ethnic breakdowns for vaccination data), let’s keep in mind who we don’t count, or who we undercount. Let’s remember that we’re often not able to visualize information about Native Americans and the hundreds of tribes in the United States, as well as Asian Americans, Native Hawaiians, and Pacific Islanders (such as Hmong, Filipinos, Cambodians, Fujians) and many more groups that we typically combine together into broad racial and ethnic categories. We need to advocate for them and for the release of their data, recognizing that the results from such data activism can catalyze social change and empower these communities to address issues of dire importance like drug overdose.

Andy Krackov runs a data storytelling consultancy, Hillcrest Advisory, that works with social sector organizations – foundations, universities, nonprofits and government agencies – to help them understand audience use cases and communicate their data findings in engaging and persuasive ways. Prior to establishing Hillcrest Advisory, Andy worked at two California-based health foundations where he funded data communication initiatives to raise awareness of pressing issues and ensure local communities have access to the information they need for decision-making. Andy recently has been developing data storytelling training content, including a continuing education course for health professionals on communicating with data that’s offered through George Washington University (https://www.innovationhorizons.net/from-spreadsheet-to-story/).

Sarah Marikos provides epidemiological and public health data services to a variety of philanthropic, government, and nonprofit organizations to help them advance community health and health equity.