Description of scope, methodology, and findings
In the year 2020, we are awash in graphics that attempt to explain and visualize the coronavirus pandemic. Since the pandemic took hold in the United States, there has been a renaissance of maps, charts, and dashboards created by major media outlets, state departments of health, and scientists.
With data about the virus now a daily part of life, what can we learn about the role and circulation of data visualizations in a pandemic? Our challenge is to understand the variety, popularity, and consequences of the data visualizations that have come to represent the impact of the virus. These questions will help us derive lessons from the initial swell of pandemic graphics, so that data visualization can be part of collective resilience and adaptive capacity. Done well, data visualization can help communicate risk, demonstrate the impact of collective action, and inform decisions that produce an equitable and resilient society. Done thoughtlessly, it makes us less resilient by feeding misinformation, confusion, or obfuscation.
What patterns appeared across some of the most popular data graphics during COVID-19?
How might we extract principles from these and other examples to learn what data graphics for resilience need to look like?
Shawn Walker, School for Social and Behavioral Sciences, ASU West
Gracie Valdez, Watts College of Public Affairs, ASU
Tanush Vinay, School of Computing, Informatics, and Decision Systems Engineering, ASU
Pauras Jadhav, School of Computing, Informatics, and Decision Systems Engineering, ASU
Our project had three major threads of activity: 1) image analysis of graphics collected from Twitter; 2) a survey of COVID-19 graphics produced by major media outlets such as the Washington Post and the New York Times; 3) a review of all 50 states' COVID-19 dashboards. For 1), our team collected tweets that used the #flattenthecurve hashtag, a trending topic on Twitter from March until June 2020, and created a database of over one million tweets. We ran a classification algorithm over that database to identify data graphics (maps, charts, and infographics) and set them apart from other kinds of images. This produced a collection of tweets that contained data graphics, whether posted or retweeted. From that collection, we extracted the image files to create a dataset of data graphics for further study. We then ran a clustering routine on those graphics, which grouped each chart into a “family” of charts that looked nearly the same. We also set up a comparison against a random selection of one million tweets from the general #covid19 hashtag on Twitter, spanning a similar time frame. We used the clustering and the comparison to create a set of general observations for all data collected through Twitter. The workflow for 1) can be found in Figure 1.
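The grouping of near-identical charts into families can be illustrated with a minimal sketch. This is not the clustering routine the team used, which is not specified here. It swaps in a simple, commonly used stand-in: an average hash of each image, with images linked into the same family when their hashes differ by at most a small Hamming distance. The function names, the `max_distance` threshold, and the assumption that every image has already been decoded and resized to the same small grayscale grid are all hypothetical choices made for illustration.

```python
from itertools import combinations


def average_hash(pixels):
    """Return a tuple of bits: 1 where a pixel is brighter than the image mean.

    `pixels` is a 2D list of grayscale values. All images are assumed to have
    been resized to the same small grid beforehand (an assumption of this sketch).
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if value > mean else 0 for value in flat)


def hamming(a, b):
    """Count the positions where two equal-length hashes differ."""
    return sum(x != y for x, y in zip(a, b))


def cluster_families(images, max_distance=1):
    """Group images whose hashes are within `max_distance` bits of each other.

    `images` maps a name to a 2D grayscale grid. Returns a sorted list of
    families, each a sorted list of image names. Linking is transitive, so
    a family is a connected component of the "nearly the same" relation.
    """
    names = sorted(images)
    hashes = {name: average_hash(images[name]) for name in names}
    parent = {name: name for name in names}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(names, 2):
        if hamming(hashes[a], hashes[b]) <= max_distance:
            parent[find(a)] = find(b)

    families = {}
    for name in names:
        families.setdefault(find(name), []).append(name)
    return sorted(families.values())
```

The pairwise comparison is quadratic in the number of images, so a collection on the scale described above would need an indexing or bucketing step first. The sketch only shows the idea of a family as a set of images linked by visual similarity.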
For 2), we inspected the websites and web archives of major news publications, including the New York Times, the Washington Post, the Economist, the Wall Street Journal, and more. For this part of the project, we looked specifically at the use of maps and time series plots to communicate the spread of COVID-19 infections in the United States. We created an archive of images captured from these sites and used them to make general observations about prominent data graphics. The images collected can be found here: https://www.dropbox.com/home/KER%20Data%20Visualization%20Project/Observations%20%26%20Notes/Visuals%20-%20Media%20Outlets
For 3), we surveyed the COVID-19 response dashboards hosted by all 50 states. We created an evaluation framework that records the software, graphic types, and types of statistics used by each dashboard. The result was a 50-state overview that we analyzed to produce some of our general observations and recommendations. The full review can be found here: https://www.dropbox.com/home/KER%20Data%20Visualization%20Project/Observations%20%26%20Notes?preview=State+Health+Departments.xlsx
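One way to picture such an evaluation framework is as one record per state dashboard, with simple tallies across records. The sketch below is an assumption about how the review could be structured, not the layout of the team's spreadsheet. The class name, field names, and helper functions are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class DashboardReview:
    """One reviewed state dashboard (field names are hypothetical)."""
    state: str
    platform: str                                       # software used to build the dashboard
    graphic_types: list = field(default_factory=list)   # e.g. "map", "time series"
    statistics: list = field(default_factory=list)      # e.g. "cases", "deaths"


def platform_counts(reviews):
    """Count how many dashboards were built on each platform."""
    return Counter(review.platform for review in reviews)


def graphic_type_share(reviews, graphic_type):
    """Return the fraction of dashboards that use the given graphic type."""
    if not reviews:
        return 0.0
    uses = sum(graphic_type in review.graphic_types for review in reviews)
    return uses / len(reviews)
```

With records like these, questions such as "how many distinct platforms are in use?" reduce to `len(platform_counts(reviews))`.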
It was also clear from our survey of media and state resources that maps were a mainstay of visualizing COVID-19 trends, even where the presentation made no clearly spatial point about the epidemic. For instance, no website we surveyed explicitly indicated that the geographic proximity of one county or state to another may elevate the risk of transferring COVID-19 across borders, yet nearly every site we looked at used state- and county-level maps to show the distribution of cases. State websites seemed to reproduce one another's presentation style, sometimes down to the software level: all fifty states drew from fewer than 10 platforms to build their resources, creating a kind of visualization monoculture.
Ultimately, maps and trendlines abound in the data we collected from every source, but these visualizations and their accompanying summary statistics did little to speak to the needs of laypersons or the decisions they make on a daily basis. Maps with swelling or darkening regions that show the virus everywhere, and lines that spike up or flatten, each engage only obliquely with the risks people face and provide little nuance or decision support. What's worse, these habits of visualizing the pandemic inadvertently make its profoundly large numbers more abstract and less connected to the consequences of policy or collective behavior.
Our deliverables for the project were a submission to a special issue of a journal (A Special Issue of International Communication Association), a series of guide documents on visualization design and COVID-19, and an open-source document that further develops some of the team's observations and recommendations. We also have an accepted article for Slate describing some of our work, and we produced a podcast describing some of the complex issues surrounding state dashboards for epidemics.
This program allowed us to stage a broad-based investigation of data visualizations during the COVID-19 pandemic. Our research gives us a better understanding of the collective mores of data visualization in a crisis, a broad perspective on tendencies in risk communication, and a view of the opportunities for improvement.
How we visualize harm, risk, and the consequences of collective decisions is a critical part of adaptation and resilience. Data visualization enables new kinds of decision making and helps viewers sort through data that would be impossible to approach without externalizing it through visuals. Employing visualization deliberately, with full knowledge of what does and does not work, is an important way to make sure that complex data can matter to decision makers.
We hope that our work will influence future practices in visualizing data from disasters and other widespread threats to public health. In the short term, we hope to better inform audiences so they can be more deliberate in what they recognize and take away from data visualizations about the pandemic.
Our work is use-inspired and seeks to join disciplines such as sociology, information science, computer science, design, and public policy.
Which of the characteristics of community resilience does this project draw from or help build?
Visualizations are part of how we leverage large dynamic data sets, and better visuals contribute to better self-regulation of communities and organizations.