
From Space to Story in Data Journalism

“Two weeks ago today, a satellite whirled above Washington on its way around the earth and shot photographs from 400 miles up that could change the way some people do business.”
— The Washington Post, October 14, 1999

Almost 25 years ago, The Washington Post reported on the first picture delivered by the brand-new Ikonos satellite. It was the first commercial imaging satellite capable of acquiring data that rivaled the resolution of spy satellites. Curiously, the reporter, D’Vera Cohn, failed to mention that one of the industries that would be changed by satellite imagery would be the news business itself.

The launch of Ikonos was one of a handful of developments that allowed newsrooms to expand from reporting on rocket launches and satellite hardware to using remote sensing data as an essential tool to help tell stories. A wide variety of satellite data is now used to provide context to the news, to document events, and to support investigations.

Several other factors combined with the advent of commercial high-resolution data to help make remote sensing a resource for journalists. In the early 2000s, data from government research satellites became widely available at no cost. This trend culminated in 2008 when the entire archive of Landsat data, which once cost thousands of dollars per image, was released for free. At the same time, advances in computers enabled rapid processing and storage of large datasets. Google’s Earth Engine and other cloud computing services allow types of analysis, especially of time series, that once required supercomputers. Finally, an ecosystem of free and open source software has evolved to supplement the boutique commercial applications that were once required to read the (sometimes esoteric) formats used to store and distribute remote sensing data.

Instead of reporting on a scientist’s research or claims made by an intelligence agency, reporters could now tell their own stories with this data.

How journalists use data

Although the boundaries are fuzzy, I think there are several distinct approaches journalists take with satellite data:

Context – imagery that helps readers understand the larger picture or background of a story.

Documentation – imagery that shows an event, like an explosion, or the result of an event, like damage from a natural disaster.

Investigation – imagery that is used to draw a novel conclusion.

Context

Satellite imagery used to show context functions like a locator map. It grounds the reader and helps them absorb new information. Sometimes this takes the shape of a map tucked away in the corner of a layout, a base layer with other information composited on top, or even simply a visual element meant to draw the eye.

The example above, from Quartz, uses an image of central South America from a weather satellite to show the extent of smoke over the Amazon during the 2019 fire season. The picture helps orient the viewer, and demonstrates how intentionally set fires have a continent-wide impact.

Satellite imagery can serve as an excellent basemap – a background layer with more detailed or prominent information composited on top. This works especially well when the imagery is idealized or simplified, as in this map of a highway through the Brazilian Amazon published in the Washington Post. The Post’s graphics reporters lightened satellite imagery; combined that with contextual map elements, including indigenous and protected territories, water bodies, and population centers; and highlighted the route of the highway through the rainforest. This careful layering of elements created a visual hierarchy, with the most important information in the image’s foreground, and supporting information in the background. The result is a map that’s easy to read at a glance, while providing detailed information when studied carefully.

Another way to use satellite imagery as a basemap is to remove color (which can be busy and distracting) entirely. In this map of the Gaza Strip, Carl Churchill of the Wall Street Journal used a lightened copy of grayscale Landsat data as a background layer. Colored dots and squares, representing different types of water infrastructure, stand out against the muted background – but it’s still clear where they are in relation to the roads and urban areas shown in the satellite imagery.
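
The lightening and desaturation described in these two basemap examples can be sketched in a few lines of Python. This is only an illustration, not the workflow used by either newsroom: the function names, the 0.6 blend amount, the sample pixels, and the choice of Rec. 601 grayscale weights are all assumptions made for the example.

```python
# Illustrative sketch only: names, sample pixels, and parameter values are
# assumptions, not the method used by the Post or the Journal.

def to_gray(rgb):
    """Convert an (R, G, B) pixel (0-255) to one brightness value (Rec. 601 weights)."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def lighten(value, amount):
    """Blend a 0-255 value toward white: amount=0 leaves it unchanged, amount=1 gives white."""
    return value + (255 - value) * amount


def muted_basemap(pixels, amount=0.6):
    """Return a lightened grayscale copy of a 2D grid of RGB pixels."""
    return [[round(lighten(to_gray(p), amount)) for p in row] for row in pixels]


# A tiny made-up 2x2 "image": black, dark green, tan, and white pixels.
image = [[(0, 0, 0), (40, 90, 30)],
         [(200, 180, 150), (255, 255, 255)]]
basemap = muted_basemap(image)
```

Blending toward white compresses the whole image into the light end of the tonal range, which is what leaves room for saturated symbols to stand out on top.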

Satellite imagery doesn’t always have to be imagery. (In fact, there isn’t really much of a distinction between satellite “imagery” and satellite “data”.) Imaging satellites measure the intensity of discrete wavelengths of light (colors) that are then used to calculate properties of the Earth’s surface, like vegetation health or cloud cover. In this case, the New York Times’ Mira Rojanasakul used satellite data from the Copernicus Programme (likely Sentinel-2) to distinguish land from water along the southern reaches of the Mississippi Delta.

Using satellite data instead of a map allowed the Times’s visualizers to show fine detail in an ever-changing landscape. The use of a palette that exhibits gradations between land and water, rather than a hard boundary, conveys that this is a region of marshes and tidal zones that may be dry one day and wet the next. This is all in support of the larger story – of the potential encroachment of saltwater from the Gulf of Mexico into the water system of New Orleans. 

More abstract satellite data can also provide context to a story. Sea ice dictates the path of any journey through the Northwest Passage, so a map of ice extent was essential to illustrate the route chosen by the sailboat Polar Sun in the summer of 2022. The cartographers at National Geographic merged 83 days of sea ice data from NASA to give a sense of the challenging conditions the boat had to navigate through. Note the attention to detail – the rough look of the ice on the map matches the appearance of the ice mélange that floats along the edges of the Arctic ice pack. (Watch this recording of Soren Walljasper’s 2023 North American Cartographic Information Society (NACIS) talk to learn more about the making of this and other National Geographic expedition maps.)

Documentation

A further role served by satellite data in media is documentation: imagery of a specific place and time that shows something happening, such as an unfolding conflict, the impact of a natural disaster, construction, or change on the Earth’s surface.

In many ways, the “killer app” for remote sensing is observing conflict zones. The first high-resolution images from space were collected by and for intelligence agencies, organizations that monopolized the field for decades. Satellites provide access to far-flung areas that are dangerous, inaccessible, or both. Imagery can be a dramatic view of an unfolding event, as in the plume of black smoke belching from a Saudi oil facility in the aftermath of Houthi drone strikes above (created by me). Or it can be more subtle, like the pictures showing the construction of trenches and other defenses in Russian-occupied Ukraine, below.

Satellites are an unparalleled tool for showing landscapes before and after an event, or change over time. Landsat, Sentinel-2, and other monitoring satellites have a predictable orbit, and take images from the same perspective and time of day on a fixed schedule. Combined with precise calibration, this allows comparisons over time that can show trends. The sequence of true-color Landsat images below shows the shrinking of California’s Salton Sea after years of drought. Despite being collected over the course of two decades and by two different satellites, the data can be analyzed to accurately show the position of the shoreline (and, with the right analysis, other properties like water quality or the health of the surrounding fields).

Different types of imaging satellites have different strengths and weaknesses: in general, the lower the resolution the broader and more regular the coverage. Higher resolution satellites image smaller areas less frequently. In addition, very high resolution satellites (1 meter per pixel resolution and better) must be tasked – instructed to take a picture of a particular spot on Earth at a specific time. It’s important to plan ahead if you are trying to capture an event.

Despite these complications, high-resolution data is still a useful tool for analyzing the impact of events – in particular natural disasters like hurricanes, fires, earthquakes, and landslides.

Hurricanes Eta and Iota both struck the indigenous community of Haulover, Nicaragua within a span of two weeks. The storms destroyed much of the village, slicing through the narrow barrier island the settlement was located on. These two SkySat images, with a resolution of about 80 centimeters per pixel, show the village before and after the hurricanes struck. The powerful storms opened an inlet linking the Caribbean Sea (right) to the Laguna de Wouhnta (left) directly through the tiny village. The New York Times used the pictures to illustrate the impact of the storms on the largely indigenous community.

As with maps used to provide context, there’s no reason to be limited to showing only true-color imagery for documentation. There are thousands of different data products derived from satellites, describing properties of the Earth’s surface and atmosphere from sea surface height to ozone. In 2023, for the first time in decades, a significant snowpack persisted in California’s Sierra Nevada and Klamath mountain ranges deep into the summer. The Los Angeles Times used daily snow depth data from the National Operational Hydrologic Remote Sensing Center to compare early summer 2023 to 2022, illustrating the vast difference in snow cover. (This snow depth data isn’t purely from satellite measurements – it’s a type of assimilated data that blends orbital, aircraft, and ground-level data with physics-based mathematical models to give a seamless estimate of snow depth.) 

As with many rigid categories imposed on a messy real world (species, planets, gender …) there isn’t a distinct line between using satellite data for “documentation” and data for “investigation”. BuzzFeed News’s work tracking the growth of Uyghur detention camps in China’s western Xinjiang region is a case in point. Satellite data augmented clues discovered on Chinese web maps to uncover mass detentions.

The BuzzFeed reporters identified the location of camps by looking for blank spots on web maps from Chinese search provider Baidu. Some of these missing tiles coincided with the locations of known camps and military bases, but many others were near seemingly innocuous industrial areas. Checking recent satellite imagery available from Google Earth, Sentinel-2, and Planet, the team discovered hundreds of newly constructed facilities that shared features with known Chinese prisons, and matched the descriptions provided by detainees. Recognized as “clear and compelling” journalism, the stories won the 2021 Pulitzer Prize for International Reporting.

Investigation

To me, the most exciting use of satellite imagery in journalism is for investigative reporting – data as a research tool, used to make discoveries and draw inferences. One early and innovative example came from Reveal News in the story “Who is the Wet Prince of Bel Air? Here are the Likely Culprits”. The reporters – Michael Corey and Lance Williams – used a combination of techniques to identify the largest residential users of water in Los Angeles during the California drought of the mid-2010s. (State water agencies released a list of their largest water users, but could not share names or addresses.)

A measure of vegetation health called the Normalized Difference Vegetation Index (NDVI) helped identify properties in Los Angeles with large expanses of lush greenery. The vegetation measurements were derived from National Agriculture Imagery Program (NAIP) data, a free source of high-resolution aerial imagery refreshed every few years. This was combined with estimates of soil moisture from Landsat data, which is lower resolution than NAIP but provides information in additional wavelengths. The combined datasets gave more reliable estimates of water use than either technique used alone. To reduce the uncertainty further, Reveal even looked at the proportion of grass, trees, and shrubs on each property.
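
For readers unfamiliar with NDVI, the index itself is simple arithmetic on two bands: near-infrared (NIR) and red reflectance. The sketch below shows the standard formula plus a hypothetical helper that reports how much of a property looks "lush". The 0.5 threshold, the helper name, and the sample reflectance values are assumptions for illustration, not figures from Reveal's analysis.

```python
# NDVI is a standard index; everything else here (threshold, helper name,
# sample values) is a made-up illustration, not Reveal's actual method.

def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red).

    Ranges from -1 to 1. Healthy vegetation reflects strongly in the
    near-infrared and absorbs red light, so it scores high.
    Returns None when both bands are zero (no signal).
    """
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total


def lush_fraction(pixels, threshold=0.5):
    """Fraction of valid (nir, red) pixels whose NDVI exceeds the threshold."""
    values = [ndvi(nir, red) for nir, red in pixels]
    valid = [v for v in values if v is not None]
    if not valid:
        return 0.0
    return sum(v > threshold for v in valid) / len(valid)


# Four hypothetical pixels on one parcel: two irrigated lawn, two dry ground.
parcel = [(0.50, 0.05), (0.45, 0.08), (0.30, 0.25), (0.20, 0.18)]
share_green = lush_fraction(parcel)
```

In this toy parcel the two lawn pixels score well above the threshold and the two dry pixels score near zero, so half of the property counts as lush.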

The result? A list of locations, each annually consuming millions of gallons of water, and an investigation by the Los Angeles City Council. It is an example of investigative journalism having an impact on the actions of local government.

I’ve already mentioned that a primary use of satellite data is monitoring inaccessible locations. A great example of this is Bellingcat’s effort to track illicit shipping of grain from the port of Sevastopol in occupied Crimea, Ukraine. Not only is the port in a war zone, but ships docking in Sevastopol often turn off their Automatic Identification System (AIS) transponders – effectively hiding their location. By obscuring their movements, ships can evade sanctions on exports from Sevastopol and transport stolen grain.

As with Reveal, the Bellingcat team combined multiple types of data to track hidden activity for the story “Grain Trail: Tracking Russia’s Ghost Ships with Satellite Imagery”. They used commercial high- and very-high-resolution optical PlanetScope and SkySat imagery from Planet, plus open-access medium-resolution Synthetic Aperture Radar (SAR) data from Sentinel-1. The Planet imagery revealed a ship docked at the Avlita grain terminal on more than 100 days in the year following the Russian invasion of Ukraine, despite incomplete coverage and frequent cloudiness. Sentinel-1 SAR data analyzed with the Ship Detection Tool (a machine learning algorithm run on Google Earth Engine) determined there was a ship present at the terminal on more than two dozen additional days. SAR can penetrate clouds, but is not available as frequently as Planet’s optical imagery, so even the combined dataset is likely an undercount.
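
The logic of merging the two sources is a set union: a day counts if either sensor saw a ship, and the SAR-only days are the "additional" ones. The sketch below is a minimal illustration of that bookkeeping. The function name and the dates are hypothetical, and it says nothing about how Bellingcat actually stored or processed its detections.

```python
# Hypothetical bookkeeping sketch: the dates and names are invented for
# illustration and are not Bellingcat's data or code.
from datetime import date


def combine_detections(optical_days, sar_days):
    """Merge per-day ship detections from two independent sensors.

    Returns the days seen by either sensor and the days only SAR caught
    (for example, when clouds blocked the optical view).
    """
    optical = set(optical_days)
    sar = set(sar_days)
    return {
        "combined": sorted(optical | sar),  # a ship was seen by at least one sensor
        "sar_only": sorted(sar - optical),  # days added by radar alone
    }


optical_days = [date(2022, 6, 1), date(2022, 6, 2), date(2022, 6, 5)]
sar_days = [date(2022, 6, 2), date(2022, 6, 3)]
result = combine_detections(optical_days, sar_days)
```

Days on which neither sensor collected usable data never enter either set, which is why a combined count like this is a lower bound on the true number of days a ship was present.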

Bellingcat reporters augmented the satellite data with photographs of the Avlita grain terminal in Sevastopol, and the Bosporus Strait that links the Black Sea with the Mediterranean. This “ground truth” information helped the researchers identify and track the individual ships spotted in Crimea. The combined datasets reveal the larger scope of illegal grain shipments in a way that is more comprehensive than any of the techniques alone.

Like Bellingcat, the New York Times used a mix of ground-based evidence, satellite data, and machine learning to monitor illicit activity. But instead of monitoring the motions of ships through time, the Times’s staff mapped unregistered airstrips across the Brazilian Amazon. They then analyzed additional satellite data to document illegal mining that occurred near the airstrips, and tracked aircraft delivering supplies.

Another example of researchers using machine learning and satellite data to detect illegal activity is “Myanmar’s Poisoned Mountains” by Global Witness. Because Global Witness is an advocacy organization rather than a newsroom, the piece doesn’t quite fit here, but I think the story of the growth of illegal rare earth mines along Myanmar’s border with China is one worth reading.

One of the more creative uses of satellite data I’ve seen is an analysis of the flight of the Chinese surveillance balloon that passed over Canada and the United States in early 2023. The story started with a machine learning approach similar to those I’ve already described, which was used to locate the balloon over North America and then track it back to Hainan Island, China. But that left an outstanding question – was the path of the balloon driven solely by wind currents? Or was it being actively guided? With no known source of propulsion, the only way to steer the balloon would be to adjust its altitude until it was carried along by favorable winds.

The Times’s Visual Investigations team took advantage of a quirk present in most satellite imagery – each color is collected at a slightly different time – to determine the balloon’s altitude. (You may have noticed rainbow planes while browsing Google Earth or a similar satellite-driven map. The phenomenon is similar, except there’s additional spacing between each color due to an aircraft’s high speed.) Essentially, by knowing the speed and altitude of the satellite, and the elapsed time between each picture, they could estimate the balloon’s altitude with trigonometry. They concluded the balloon was, in fact, being guided – at least over some of its journey.
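
The geometry behind that estimate can be sketched with similar triangles. This is a simplified illustration, not the Times's actual calculation: it assumes the satellite looks nearly straight down, treats the ground as flat, and ignores the balloon's own drift during the fraction of a second between color bands. The function name and all numbers are made up for the example. If the satellite travels a baseline B between two bands while flying at altitude H, an object at height h appears shifted on the ground by d = B·h / (H − h), which rearranges to h = d·H / (B + d).

```python
# Simplified parallax sketch under stated assumptions (near-nadir view,
# flat ground, stationary object). All values below are illustrative.

def estimate_altitude_m(offset_m, sat_speed_m_s, sat_altitude_m, band_delay_s):
    """Estimate an object's height from the apparent shift between two color bands.

    offset_m       -- ground distance between the object's position in each band
    sat_speed_m_s  -- satellite speed along its orbit
    sat_altitude_m -- satellite height above the ground
    band_delay_s   -- time between the two band exposures
    """
    baseline_m = sat_speed_m_s * band_delay_s  # distance the satellite moved
    # Similar triangles: offset / height = baseline / (sat_altitude - height)
    return offset_m * sat_altitude_m / (baseline_m + offset_m)


# Made-up example: a satellite at 500 km moving 7.5 km/s, bands 0.5 s apart,
# and a 156.25 m offset between the balloon's position in the two bands.
height = estimate_altitude_m(
    offset_m=156.25,
    sat_speed_m_s=7_500,
    sat_altitude_m=500_000,
    band_delay_s=0.5,
)
```

With these invented inputs the estimate comes out to about 20 km. An object on the ground produces no offset and therefore a height of zero, which is why surface features stay sharp while high-flying objects split into colored ghosts.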

ProPublica is well known for its deep dives into American politics, but it also reports on a wide range of environmental issues, often with the help of remote sensing data. Its series on locations at risk for future Ebola outbreaks combined investigative reporting with original scientific research. The articles uncovered how the fragmentation of forests around networks of villages and towns in Equatorial Africa correlated with known outbreaks of Ebola, and identified places where the disease may next spill over from wildlife to humans.

The series combined satellite data – long-term records of changing global forest cover and settlement maps – with pattern-finding algorithms, calculations of forest fragmentation, cloud computing on Google Earth Engine, epidemiological models, consultation with scientists, and interviews with the people of Meliandou, Guinea who survived the worst Ebola outbreak in history. Their conclusion is not just a warning for at-risk communities, but also a set of recommendations to reduce the likelihood of future outbreaks.

Most of my examples have shown reporting in far-flung locales (at least from my perspective in San Francisco’s tech industry), which is one of the primary strengths of satellite data. The data journalists at texty.org.ua, however, had to deal directly with tragedy and trauma when Russia invaded Ukraine in early 2022. They responded with some of the most detailed reporting on the impact of the war I’ve seen, despite working during blackouts and while sheltering from air raids.

Texty used multiple types of data to cover the war – including high resolution commercial imagery, night lights data, and NASA fire locations. Combined, the datasets give civilians whose lives have been upended by the invasion a means to investigate and respond to the tragedy in Ukraine that has been forced upon them. The stories reflect their interests and priorities.

Approaches for using satellite data in the newsroom

The use of satellite data is rapidly increasing in journalism, a trend fueled by growing availability, higher quality, and the development of more usable analysis tools. What does it take to successfully use this data to tell stories in a newsroom, and develop innovative reporting?

Teamwork: It’s difficult for a single reporter to have the wide range of skills necessary to fully exploit the potential of satellite data. Teams with expertise in a range of fields – investigative reporting, writing, design, programming, and data analysis – are conducting the most novel and impactful data journalism.

Data literacy: Satellite data comes in many forms, suitable for a wide variety of applications. Knowing what data is available, and the strengths and weaknesses of each type, is essential for using it effectively.

Outside experts: The field of remote sensing has thrived for over 50 years. In that time scientists and technicians in government, academia, and industry have developed techniques to derive insights from data. They’re an invaluable resource for both background information and innovative new ideas.

Local knowledge: Data collected from a few hundred miles above the Earth’s surface is often limited when used in isolation. It is far more reliable when combined with in-situ data, augmented by on-the-ground reporting, and (perhaps most importantly) informed by the perspective of the people who live in the areas being imaged.

Over the past ten years satellite imagery has become an important component of data journalism. In the next ten it will likely evolve further, from a tool used primarily to illustrate stories to one that is an integral part of research and investigative reporting. I’m excited to see how reporters develop innovative uses of existing datasets, and explore new types of data.

Robert Simmon is a pioneering designer and visualizer renowned for his work in cartography and science communication. With decades of experience at Planet and the NASA Earth Observatory, he transforms satellite data into captivating imagery. Robert’s work has appeared on the front page of the New York Times and the cover of National Geographic, and he crafted the iconic Blue Marble featured on the original Apple iPhone. His expertise in data visualization, remote sensing, and the use of color has left a lasting impact on the field. He is currently freelancing and open to full-time opportunities.
