
At the Vanguard of Interface Design

Implications for visualization from the 2022 UIST symposium

The ways in which we interact with data and visualization may look and feel very different in the years to come. Many of our current interfaces to visual representations of data can be described as WIMP (Windows, Icons, Menus, Pointer) systems, assuming indirect manipulation via a mouse and keyboard. Recent advances in artificial intelligence / machine learning, computer vision, environmental / physiological sensing, and material sciences may very well shape our future interactions with data.

Between October 29th and November 2nd 2022, the 35th annual ACM Symposium on User Interface Software and Technology (UIST) took place in Bend, Oregon. It was the first in-person research conference that I attended in over three years. I was there along with more than 480 attendees from 25 countries, including more than 270 graduate and undergraduate students. UIST is a highly selective conference, soliciting the best new research on interface design: of 372 research paper submissions, 98 (26%) were accepted to be presented at the symposium. UIST is also a great place to experience demonstrations of new interface prototypes, and this year didn’t disappoint with 64 demos. In addition to technical paper presentations and demos, there were two excellent keynote speakers (Ted Chiang and Marissa Mayer) as well as two shorter “vision” presentations by Mind in Motion author Barbara Tversky and Microsoft’s Chief Scientist Jaimie Teevan.

Unlike research conferences that are devoted to data visualization (such as IEEE VIS), UIST is broader in scope. While there was one session devoted specifically to information and visualization interfaces, I want to highlight a few interesting ideas and opportunities across the conference and comment on how they may be applicable to the future of data visualization. There were eighteen paper presentation sessions over the course of three days, each featuring between four and six presentations, with two sessions taking place concurrently. Given this schedule, I was only able to see about half of the presentations (all sessions were recorded and can be viewed on YouTube). Below, I include links to short video previews and open-access copies of the relevant articles where available. 

Given my role as a researcher specializing in data visualization, I decided to frame my UIST highlights as questions for future visualization research, design, and development to consider. 

  • Can data visualization shape our thoughts like written language does?
  • How might we talk about data in engaging ways?
  • How can we make sense of relationships in text and network data?
  • How might visualization manifest in generative design workflows?
  • How can sonification and physicalization help people who are blind or visually impaired understand sensor data?
  • How might we visualize and interact with data in extended reality?
  • Could reprogrammable “mixels” and shape-changing materials be used for dynamic data physicalization?
  • How can we ensure our data visualization satisfies the 3 Ps of good graphics? 

Can data visualization shape our thoughts like written language does?

The first keynote speaker was award-winning science fiction writer Ted Chiang, whose “Story of Your Life” was the basis of the Academy Award-nominated 2016 film Arrival. His keynote address reflected on how modern humans have collectively internalized the invention of the alphabet: once we learn to read and write, this technology shapes how we think. By evolving from an oral culture to a literate culture, we no longer had to memorize knowledge and therefore lost a dependency on rhyme and meter as mnemonics.

Modern technology has allowed us to further externalize our cognition, but we haven’t internalized these inventions in the same way that we have with the written word, leaving us with the feeling that modern technology is dehumanizing: we rely on artifacts and devices, and we feel paralyzed when we are without them. I prefer to see visualization as a humanizing technology, one that makes data more interpretable and accessible. Moreover, given that we are capable of visualizing data using everyday objects and without the aid of computers, have we internalized the process of data visualization in some capacity?

In rare cases, it would appear as though some exceptionally gifted individuals have internalized mathematical notations and the structure of computer code into their thinking, but unlike the alphabet, these technologies are not co-extensive with speech; thus, this internalization is not as broadly applicable to all human activities. These observations left me wondering about the ways by which we might internalize aspects of data visualization into our thinking, and the power of visualization as a communicative language. 

Finally, Chiang remarked that human alphabets are both easy to recognize and easy to write, and he asked whether there are shapes that are easy to recognize but difficult to generate without the aid of a computer. I would argue that many manifestations of data visualization fit this description, particularly those that employ shape as a visual encoding channel. For instance, vision science research by Liqiang Huang (2020) suggests that basic geometric shapes that are easy to draw may not be as perceptually separable as shapes that vary along the dimensions of segmentability, compactness, and spikiness, which include shapes that may be difficult for humans (but easy for computers) to draw.

How might we talk about data in engaging ways?

My own contribution to UIST 2022 was a new approach for presenting data to remote audiences via augmented webcam video (preview), in which interactive, semi-transparent visualization overlays are composited with the speaker’s video. Interaction with these overlays is touchless: highlighting and selection are made possible by continuous bimanual hand-tracking. Former Tableau Research intern Brian Hall of the University of Michigan presented our paper, which received an honorable mention for best paper. We also invited attendees to try out our prototype at the symposium’s demo exhibit.

A frame from the video preview for "Augmented Chironomia for Presenting Data to Remote Audiences"  by Hall and colleagues, in which a man stands behind semi-transparent chart overlays and interacts with them by pinching and pointing.
Image credit: ACM SIGCHI https://youtu.be/W0l5cTuindE
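
For readers curious about how this style of touchless augmented presentation might be assembled, here is a minimal sketch of the general recipe rather than our paper’s actual implementation: alpha-blend a pre-rendered chart image over each webcam frame and watch for a pinch gesture with MediaPipe Hands. The file name chart_overlay.png and the pinch threshold are illustrative assumptions.

```python
import cv2
import mediapipe as mp
import numpy as np

mp_hands = mp.solutions.hands

def composite_overlay(frame, overlay_rgba, alpha=0.55):
    """Alpha-blend a semi-transparent chart image over the webcam frame."""
    overlay_rgb = overlay_rgba[:, :, :3].astype(float)
    mask = (overlay_rgba[:, :, 3:] / 255.0) * alpha
    return (frame * (1 - mask) + overlay_rgb * mask).astype(np.uint8)

cap = cv2.VideoCapture(0)
chart = cv2.imread("chart_overlay.png", cv2.IMREAD_UNCHANGED)  # RGBA chart rendered elsewhere

with mp_hands.Hands(max_num_hands=2, min_detection_confidence=0.6) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        chart_resized = cv2.resize(chart, (frame.shape[1], frame.shape[0]))
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            for hand in results.multi_hand_landmarks:
                thumb, index = hand.landmark[4], hand.landmark[8]  # thumb tip, index tip
                if np.hypot(thumb.x - index.x, thumb.y - index.y) < 0.05:
                    # A pinch near a chart mark could trigger highlighting or selection here.
                    cv2.putText(frame, "pinch", (30, 40), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
        frame = composite_overlay(frame, chart_resized)
        cv2.imshow("augmented presentation", frame)
        if cv2.waitKey(1) & 0xFF == 27:  # press Esc to quit
            break
cap.release()
```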

In the same vein, Jian Liao of the University of Calgary presented RealityTalk (preview), in which webcam presentations of multimedia content can be augmented in real time with kinetic typography of keywords, along with images and media associated with those keywords, placed wherever a hand or tracked object happens to be within the video frame. I was intrigued by the prospect of combining this technique with ours, so that a presenter could invoke dynamic visualization content and value or category annotations simply by uttering keywords associated with the data.

Four frames from the video preview for "RealityTalk: Real-time Speech-driven Augmented Presentation for AR Live Storytelling" by Liao and colleagues, in which a woman points to keywords and images composited over the video frame.
Image credit: ACM SIGCHI https://youtu.be/lRSai-XRYyk

How can we make sense of relationships in text and network data?

One of the highlights of the visualization session was the Scholastic project (preview) presented by Matt Hong of the University of North Carolina. For anyone who has ever undertaken a qualitative thematic analysis of a text corpus (e.g., documents, transcripts), Scholastic will be of interest: the tool visualizes the corpus at multiple reading levels and prioritizes human-in-the-loop, semi-automated clustering that aims to remain transparent and trustworthy, qualities that are critical to interpretive analysis.

A frame from the video preview for "Scholastic: Graphical Human-AI Collaboration for Inductive and Interpretive Text Analysis" by Hong and colleagues, showing three interfaces from the tools:a hierarchical clustering visualization of a document collection, a document reader, and a code examiner / word clustering interface.
Image credit: ACM SIGCHI https://youtu.be/vqOtS-AeLbE
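
As a rough illustration of the kind of pipeline a tool like this sits on top of (and emphatically not the Scholastic system itself), here is a minimal sketch that embeds a handful of invented documents as TF-IDF vectors and clusters them hierarchically so they can be examined at coarser or finer levels; it assumes scikit-learn and scipy.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from scipy.cluster.hierarchy import linkage, fcluster

documents = [
    "interview transcript about remote work habits",
    "interview transcript about commuting and office space",
    "field notes on collaborative whiteboarding",
    "field notes on video conferencing fatigue",
]

# Embed documents as TF-IDF vectors; a tool like Scholastic can use richer language models.
vectors = TfidfVectorizer(stop_words="english").fit_transform(documents).toarray()

# Build a hierarchy that an analyst could cut at different levels of granularity.
tree = linkage(vectors, method="ward")
for k in (2, 3):  # a coarse and a finer "reading level"
    labels = fcluster(tree, t=k, criterion="maxclust")
    print(f"{k} clusters:", list(labels))
```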

Moving beyond text corpora to large article collections and their related metadata, the FeedLens project (preview) presented by Harman Kaur of the University of Michigan is a new approach to faceted search that recognizes distinct entity types related to a search query. For example, consider the task of ranking cities to move to based on the cuisine preferences you have developed in your current locale, a task that involves several entity types: cities, types of cuisine, and local restaurants. Kaur and colleagues implemented FeedLens as an extension to Semantic Scholar (a tool for searching the academic literature), visualizing the rank of related entities (papers, authors, institutions, journals) as small inline charts associated with different queries (or lenses).

A frame from the video preview for "FeedLens: Polymorphic Lenses for Personalizing Exploratory Search over Knowledge Graphs" by Kaur and colleagues, showing an academic literature search interface augmented with inline charts.
Image credit: ACM SIGCHI https://youtu.be/Wxgazv_PrbQ

Continuing with the theme of connecting entities through data, the next project I highlight has to do with relating disparate data sources across the web, often a tedious manual affair of reconciling differences across APIs and confirming relationships in the data. The Wikxhibit project (preview | wikxhibit.org) presented by Tarfah Alrashed of MIT CSAIL leverages Wikidata as a universal join table and offers a low-code approach to building rich applications around the data linkages curated by the Wikidata community. This work also suggests the possibility of visualizing these relationships explicitly as part of these applications without having to write much code.

A frame from the video preview for "Wikxhibit: Using HTML and Wikidata to Author Applications that Link Data Across the Web" by Alrashed and colleagues, showing a sample web page built with Wikxhibit that collects media assets related to a popular musician.
Image credit: ACM SIGCHI https://youtu.be/CtWq0cOLekQ
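
Wikxhibit applications are authored declaratively in HTML, so the sketch below is not its API; it only illustrates the underlying idea of treating Wikidata as a join table by querying the public SPARQL endpoint with Python’s requests library. The musician QID is a placeholder that would need to be replaced with a real Wikidata identifier.

```python
import requests

SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"
MUSICIAN_QID = "Q00000"  # placeholder: substitute a real Wikidata QID for an artist

# Find albums (Q482994) whose performer (P175) is the given musician.
query = f"""
SELECT ?album ?albumLabel WHERE {{
  ?album wdt:P31 wd:Q482994 ;
         wdt:P175 wd:{MUSICIAN_QID} .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
LIMIT 10
"""

response = requests.get(
    SPARQL_ENDPOINT,
    params={"query": query, "format": "json"},
    headers={"User-Agent": "wikidata-join-sketch/0.1 (example)"},
)
for row in response.json()["results"]["bindings"]:
    print(row["albumLabel"]["value"])
```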

How might visualization manifest in generative design workflows?

Vivian Liu of Columbia University presented OPAL (preview), a tool that uses large language models and text-to-image prompting to generate illustrations for news articles, drawing on article keywords and tones. While the articles considered were primarily feature and opinion pieces, I was curious about the potential for semi-automated reporting about data: trends, before / after comparisons, and relationships. I recalled Gafni et al.’s Make-a-Scene image-prompting tool (2022), which allows for shape input to steer image generation, as well as Coelho and Mueller’s Infomages project (2020), a technique that finds images within an image collection containing shapes that can be used to emphasize or draw attention to data trends. Combining all three techniques could be an interesting way of generating news illustrations for data stories.

A frame from the video preview for "OPAL: Multimodal Image Generation for News Illustrations" by Liu and colleagues, showing an interface that allows users to input words and select tones on the left, as well as an interface to browse generated images on the right.
Image credit: ACM SIGCHI https://youtu.be/FimeghwQZyQ

Generative design can be a fruitful and serendipitous approach to design (including creative visualization design). However, it can be difficult for designers to maintain a sense of agency and control over the parameters. Yuki Koyama of Japan’s National Institute of Advanced Industrial Science and Technology presented an approach that frames design as a Bayesian optimization problem (preview), wherein an intelligent assistant observes an iterative design process and makes peripheral suggestions based on the current state of the design, providing the ability to incrementally “blend” suggestions with the designers’ own creations.

A frame from the video preview for  "BO as Assistant: Using Bayesian Optimization for Asynchronously Generating Design Suggestions" by Koyama and colleagues, showing an instantiation of the technique in a 3D design tool, in which different sliders control the parametrization of a surface texture in the 3D model.
Image credit: ACM SIGCHI https://youtu.be/XYxl4dwMUUM
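
To give a sense of the core idea (not Koyama and colleagues’ system), here is a minimal sketch in which a surrogate model observes (design parameters, designer rating) pairs and suggests a promising next candidate via an upper-confidence-bound rule; the two-dimensional design parameters and ratings are invented, and scikit-learn and numpy are assumed.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)

# Designs the designer has already tried, with their subjective ratings (0-1).
observed_params = np.array([[0.2, 0.8], [0.5, 0.4], [0.9, 0.1]])
observed_ratings = np.array([0.3, 0.7, 0.5])

surrogate = GaussianProcessRegressor(normalize_y=True).fit(observed_params, observed_ratings)

# Propose the candidate with the highest upper confidence bound (mean + k * std),
# balancing exploitation of good regions with exploration of uncertain ones.
candidates = rng.uniform(0.0, 1.0, size=(500, 2))
mean, std = surrogate.predict(candidates, return_std=True)
suggestion = candidates[np.argmax(mean + 1.5 * std)]
print("suggested design parameters:", suggestion)
```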

How can sonification and physicalization help people who are blind or visually impaired understand sensor data?

Interactive systems for monitoring streaming environmental and physiological sensor data are typically not accessible to people who are blind or visually impaired. To address this gap, the University of Washington’s Venkatesh Potluri presented PSST (preview), a tool for specifying how sensor data can be sonified with continuous changes in pitch and amplitude. Beyond sonification, Potluri and colleagues demonstrated how PSST can also print sensor data as physical punch cards for music boxes, thereby combining sonification and physicalization.

A frame from the video preview for  "PSST: Enabling Blind or Visually Impaired Developers to Author Sonifications of Streaming Sensor…" by Potluri and colleagues, in which a person is feeding a musical punch card into a crank-based music box.
Image credit: ACM SIGCHI https://youtu.be/2tvE5n_5SKo?t=4848
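
The essence of this kind of sonification can be sketched in a few lines; the example below is not the PSST toolkit, just a minimal illustration that maps invented sensor readings to the pitch and amplitude of a sine tone and writes a WAV file using numpy and the standard-library wave module.

```python
import wave
import numpy as np

readings = [18.0, 19.5, 22.0, 25.5, 24.0, 21.0]   # e.g., hourly temperatures (illustrative)
sample_rate, seconds_per_reading = 44100, 0.4

lo, hi = min(readings), max(readings)
samples = []
for value in readings:
    norm = (value - lo) / (hi - lo)                # normalize to 0..1
    freq = 220 + norm * (880 - 220)                # pitch: roughly A3 up to A5
    amp = 0.2 + norm * 0.6                         # louder for higher values
    t = np.arange(int(sample_rate * seconds_per_reading)) / sample_rate
    samples.append(amp * np.sin(2 * np.pi * freq * t))

signal = np.concatenate(samples)
with wave.open("sonified_sensor.wav", "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)                            # 16-bit samples
    wav.setframerate(sample_rate)
    wav.writeframes((signal * 32767).astype(np.int16).tobytes())
```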

How might we visualize and interact with data in extended reality?

Four of the presentation sessions were explicitly devoted to various aspects of extended reality (or XR, encompassing both virtual and augmented reality interfaces). However, XR-related projects also appeared in many of the other sessions. Several projects were affiliated with Meta Reality Labs, which has invested heavily in XR. Given Reality Labs’ platinum-level sponsorship of the conference (disclosure: my employer Tableau was a bronze-level sponsor of UIST), the recurring XR theme was not unexpected.

Beginning with visual perception in XR, Zhipeng Li of Tsinghua University showed how mapping color saturation and color value to depth could help people judge the distance to 3D content in AR applications (preview), complementing other depth cues such as contrast and blur while reserving color hue for encoding other information. Consider how, when you look at a distant landscape, points that are farther away appear a little bluer and a little blurrier. One of the examples Li showed was a 3D scatterplot, illustrating the implications for visualization in extended reality.

A frame from the video preview for "Color-to-Depth Mappings as Depth Cues in Virtual Reality" by Li and colleagues, showing a 3D point cloud in which nearer points are purple; more distant points are blue / turquoise.
Image credit: ACM SIGCHI https://youtu.be/lipwOvkvrsM?t=7617
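
Here is a minimal sketch of one plausible color-to-depth mapping in this spirit (not Li and colleagues’ technique): points are desaturated and darkened as their distance from the viewer grows, leaving hue free to encode a data category. The random positions and the specific saturation and value falloffs are illustrative, and matplotlib and numpy are assumed.

```python
import colorsys
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
x, y, depth = rng.uniform(size=(3, 200))           # depth in [0, 1], where 1 = farthest

base_hue = 0.75                                    # a purple-ish hue shared by all points
colors = [
    colorsys.hsv_to_rgb(base_hue, 1.0 - 0.7 * d, 1.0 - 0.4 * d)  # farther = paler and darker
    for d in depth
]

plt.scatter(x, y, c=colors, s=30)
plt.title("Nearer points: saturated; farther points: desaturated and darker")
plt.show()
```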

Moving on from perception to interaction in XR, Hiroki Kaimoto of the University of Tokyo presented a bidirectional interaction between mobile AR and robotic objects (preview), in which small robots obey spatial constraints imposed upon them via a mobile augmented reality application. Conversely, the movement and positions of the small robots can also update graphical elements in the AR display. Seeing this, I could imagine controlling a physicalized unit chart, where each robot represents an entity in the data (as demonstrated by Mathieu Le Goc and colleagues in 2018), which can be repositioned and recategorized manually or through interactive controls in an AR interface.

A frame from the video preview for "Sketched Reality: Sketching Bi-Directional Interactions Between Virtual and Physical Worlds with ..." by Kaimoto and colleagues, showing examples of physical to virtual interaction and virtual to physical interaction along four dimensions: boundary constraints, geometric constraints, applied force, and dynamic collision.
Image credit: ACM SIGCHI https://youtu.be/pN7QlnHTW3A

Being able to interact naturally at fine scales in XR is challenging. Ideally, we should be able to reach out, point, gesture, and make selections with our fingers, without requiring the clumsy hand-held paddle devices currently used with many XR headsets. Nathan DeVrio and Daehwa Kim (both of Carnegie Mellon University, working with Chris Harrison) respectively presented DiscoBand (preview) and EtherPose (preview), wrist-worn devices that track arm, hand, and finger movements. The former uses two arrays of tiny depth cameras, while the latter uses two antennas measuring the dielectric loading induced by the motion of the hand. If we want to make fine selections and perform precise filtering, sorting, and aggregation operations with 3D data visualization, sensing devices such as these prototypes could be very helpful.

A frame from the video preview for "DiscoBand: Multi-view Depth-Sensing Smartwatch Strap for Hand, Arm and Environment Tracking" by DeVrio and Harrison, in which an outstretched arm wearing the sensor band on the left is represented as a 3D point cloud on the right.
Image credit: ACM SIGCHI https://youtu.be/mt0IU8SrOa8
A frame from the video preview for "EtherPose: Continuous Hand Pose Tracking with Wrist-Worn Antenna Impedance Characteristic Sensing" by Kim and Harrison, in which a line chart of antenna signals is shown on the left, reflecting the pose of a hand wearing the wrist-based antenna array on the right.
Image credit: ACM SIGCHI https://youtu.be/n7eqGbqmnwc

As an alternative to hand-held and head-mounted AR displays, acoustic levitation techniques can suspend luminous particle-based holograms in mid-air. However, until recently it was impossible to interact directly with the holograms without distorting or destroying them. Diego Martinez Plasencia of University College London demonstrated TipTrap (preview), an approach that accommodates direct manual selection and deselection of hologram particles. This could be a boon for interacting directly with 3D point cloud visualization, particularly as the 3D resolution of these acoustic levitation displays improves.

A frame from the video preview for "TipTrap: A Co-located Direct Manipulation Technique for Acoustically Levitated Content" by Jankauskis and colleagues, showing a single hand where a finger is approaching a floating constellation of particles.
Image credit: ACM SIGCHI https://youtu.be/pTS3wIFDfdw

Could reprogrammable magnetic “mixels” and heat-triggered shape-changing materials be used for dynamic data physicalization?

The UIST community also offers new ideas that could be applicable to the implementation of dynamic data physicalization. Shape-changing physical displays are often complicated electronic and mechanical devices that are energy-intensive and have limited expressivity. I want to highlight two projects that illustrate promising alternative approaches. 

First, Martin Nisser of MIT CSAIL presented Mixels (preview), short for magnetic pixels. These small cubes have tiles that can selectively and dynamically couple with other tiles, allowing them to assemble into varying 3D configurations when reprogrammed with new magnetic signatures. I could imagine mixels reassembling into physical unit charts, sorting and resorting themselves according to different categorizations of the data. 

Second, Justin Moon of KAIST presented ShrinkCells (preview), an efficient way to change the shape of a 3D-printed object by propagating heat across the individually-printed filaments of the object. Seeing this, I envisioned the implications for continuous or cyclical animated transitions in data physicalization, such as one that represents daily or annual climate patterns. 

A frame from the video preview for "Mixels: Fabricating Interfaces using Programmable Magnetic Pixels" by Nisser and colleagues, showing four steps to assign new magnetic signatures to a set of tiles, resulting in a new magnetic affinity between the Mixels.
Image credit: ACM SIGCHI https://youtu.be/JKuCthryzhA
A frame from the video preview for "ShrinkCells: Localized and Sequential Shape-Changing Actuation of 3D-Printed Objects via Selectiv..." by Moon and colleagues, showing diagrams that explain how heat can be selectively and sequentially propagated across the cells, resulting in staged bending and collision avoidance during cell activation.
Image credit: ACM SIGCHI https://youtu.be/DplBQ_g5Uio

How can we ensure our data visualization satisfies the 3 Ps of good graphics?

I’ll conclude my recap with Barbara Tversky’s vision talk to the UIST 2022 attendees. Tversky is a professor emerita of psychology at Stanford and Columbia Universities, and her 2019 book Mind in Motion: How Action Shapes Thought should be required reading for all visualization practitioners. Her vision talk extended what she had written in her book about how we think in terms of spatial and sequential relations, and how visual tools can help or hinder this thinking. She emphasized the 3 Ps of good graphics: Production, Preference, and Performance, which prompted me to reflect on the visualization tools, charts, and dashboards we make for others to use and read. When we make visualization tools or visualize data for others, we should consider whether and how they can use these tools to produce new representations and new knowledge, whether they prefer these tools to other approaches (including those delivered in other modalities), and how they use these tools to perform a task or make a decision.

Matthew Brehmer is a senior research staff member of Tableau Research in Seattle, where he specializes in creative expression with data for communication and presentation. Prior to joining Tableau, he was a postdoctoral researcher at Microsoft Research, which followed his PhD research on information visualization at the University of British Columbia. In addition to his research work, he is also dedicated to connecting visualization research with practice: co-organizing the VisInPractice event at IEEE VIS and speaking at practitioner events such as OpenVisConf, the Microsoft Data Insights Summit, and the Tableau Conference. Learn more about his work at mattbrehmer.ca and connect with him online at @mattbrehmer or at linkedin.com/in/matthewbrehmer.
