Why we need to think more about how we show and share data about COVID-19
Between March and December 2020, the landscape of data visualizations describing the COVID-19 pandemic expanded and became a fixture of everyday life for many. Major newspapers, Google, federal and state governments, universities, pop-up alliances, and independent data scientists have set up online resources that visualize summary and predictive figures about the pandemic.
But this landscape can be uneven and fraught with pitfalls. If visuals command more attention as people attempt to make sense of the pandemic, their risk, and the status of their immediate and regional communities, then the stakes of data visualizations about COVID-19 are greatly elevated.
We should recognize at the outset that visualizing data on a short timetable in a high-pressure situation is difficult at any time, let alone during a pandemic. That said, there are several areas for improvement: individual charts could be clearer about their purpose, and audiences could be given more opportunities to make meaningful comparisons across timeframes, regions, and personal activities. Our team has reviewed nearly 100 different dashboards from governmental, academic, journalistic, and independent sources. We evaluated them according to their transparency, design, accessibility, and explanatory value.
Additionally, we paid close attention to how graphics and summary statistics have defined the pandemic. Many dashboards emphasized deaths, hospital beds, positive cases, and location. But this is a sparse representation of a complex problem with complex impacts. Dashboards cannot represent the whole complexity of what people face during the COVID-19 pandemic, but an overly narrow scope can often obscure the reason the dashboard exists in the first place.
This concern for scope relates to the purpose of data dashboards for COVID-19. If a dashboard only reports deaths, cases, and hospital beds, or emphasizes summary data above all other things, it raises the question of what decision or behavior the graphics are supposed to support. What message are dashboards like this trying to send?
In the absence of a specified purpose, dashboards can appear to be generic surveillance that falls back on the idea that “the data speaks for itself.” But data never speaks for itself; it is subject to layers of interpretation, assumptions, and rhetoric. Presenting data in general, summary form, without linking it to specific decisions or problems, allows audiences to mistake dashboards for more comprehensive representations than they really are.
And mistaking the grand-scale view offered by a dashboard of maps and figures for the full picture deepens our misunderstanding. Presenting incomplete data without calling attention to how incomplete it is; sweeping the reporting practices that created the data beneath a more appealing graphic; showing maps when we should be showing networks; presenting raw counts unadjusted for population: small omissions and invisible decisions like these accumulate to make graphics a bane to any sense of collective understanding.
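To make the point about raw counts concrete, here is a minimal sketch in Python of adjusting a count for population. The region names and all of the numbers are invented for illustration and are not drawn from any dashboard we reviewed; the function name rate_per_100k is likewise our own.

```python
# Illustrative only: region names, case counts, and populations are made up.

def rate_per_100k(count: int, population: int) -> float:
    """Return the number of events per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 100_000 / population


# Hypothetical regions: A has more cases in total, B has far fewer residents.
regions = {
    "Region A": {"cases": 5_000, "population": 2_000_000},
    "Region B": {"cases": 1_200, "population": 150_000},
}

# Compute the population-adjusted rate for each region.
rates = {
    name: rate_per_100k(r["cases"], r["population"])
    for name, r in regions.items()
}

for name, r in regions.items():
    print(f"{name}: {r['cases']} cases, {rates[name]:.0f} per 100,000")
```

In this made-up example, Region A reports roughly four times as many cases as Region B, yet Region B's rate per 100,000 residents is more than three times higher. A chart of raw counts and a chart of rates would rank the two regions in opposite order, which is exactly the kind of invisible decision a dashboard should state openly.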
If these graphics fall short, or if audiences cannot or do not interpret them in ways that eliminate confusion and distrust of institutions, we have to return to the question of why it is necessary to visualize data about the COVID-19 pandemic in the first place. An emphasis on merely “reporting” the data can too often occlude the follow-up considerations of “for whom” and “for what purpose?”