We’re wrapping up one of the most unusual months of our lives, and indeed of global history. As I’ve tried to wrap my head around news of the COVID-19 pandemic, I’ve found it challenging to contextualize many of the numbers. I couldn’t find a source that compared multiple views of the same information, which I often find is the best way to understand a dataset.
So, I made my own visualizations from publicly available data. To be clear, I have little background in anything related to the modeling of infectious diseases, and as a result I do not try to draw much out of the data. However, I have found these visualizations helpful in my own understanding of the spread of the pandemic.
I’d also be remiss not to mention some of the many caveats with this data and analysis: the number of reported cases obviously does not include the large number of unreported cases and depends on the level of testing in each country; many people have questioned the accuracy of reported numbers from some countries; and country-level views of population density miss important factors related to local population density and transmissibility.
The data was obtained from the ECDC on April 1, 2020. The selected countries were chosen fairly arbitrarily based on how frequently I’ve seen them in the news. All code is contained in this GitHub repository; the primary analysis notebook is here. The interactive plots were created using Altair.
Of course, the number of reported cases is not the number of actual cases; it varies widely depending on testing practices and other factors.
- China’s officially reported numbers have encountered widespread skepticism. My understanding is that this is attributed to the combination of an overall low number of cases and the changing definitions of a COVID-19 case. This skepticism certainly does not seem unwarranted given the odd shape of the curve and its low magnitude.
- The number of cases in all selected countries except China and South Korea is still rapidly increasing.
- Japan has a shockingly low number of cases, given how often I heard about its outbreak.
- The US growth is very steep. It seems obvious to me that it’s way too early to consider relaxing restrictions.
Total cases, normalized
- Using China’s officially reported numbers, its cases barely register when measured against the population as a whole.
- Spain and Italy stand out for their high density of cases, despite not having aggressive testing.
- Iran has notably slower growth than the other countries with an exponentially-growing number of cases.
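For readers who want to see what the normalization amounts to, here is a minimal sketch in plain Python. It assumes "normalized" means reported cases divided by population, scaled to cases per 100,000 people; the scale factor, the function name `cases_per_100k`, and the numbers are all illustrative assumptions, not taken from the analysis notebook or from real data.

```python
def cases_per_100k(total_cases, population):
    """Reported cases per 100,000 residents (hypothetical helper)."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply before dividing to keep the arithmetic simple and exact
    # for round numbers like the ones below.
    return total_cases * 100_000 / population


# Made-up numbers for illustration only: two countries with the same
# raw case count but very different populations.
example = {
    "Country A": (50_000, 10_000_000),
    "Country B": (50_000, 1_000_000_000),
}

normalized = {
    name: cases_per_100k(cases, pop) for name, (cases, pop) in example.items()
}
```

With identical raw counts, the smaller country ends up with a rate 100 times higher, which is why a large country can barely register on a normalized plot.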
Total cases, normalized, starting with day of first reported case
I’m not sure if “days since first reported case” is quite the right approach (as opposed to days until some critical mass of cases is achieved), but I was curious to see the results.
- The trajectories of Italy and Spain are very close.
- Again, Iran’s growth is notably slower, which is especially surprising given how early its outbreak occurred.
- Using the date of the first reported case as the starting point, the number of cases in the US takes off the latest relative to other countries.
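The re-alignment behind this view can be sketched as follows. This is a hedged illustration rather than the notebook's actual code: the function name `align_to_threshold` and the sample series are hypothetical. The `threshold` parameter defaults to 1 (first reported case), and raising it gives the "critical mass of cases" alternative mentioned above.

```python
def align_to_threshold(cumulative_cases, threshold=1):
    """Trim a chronological series of cumulative case counts so that
    index 0 is the first day the count reached `threshold`.

    Returns an empty list if the threshold was never reached.
    """
    for day, total in enumerate(cumulative_cases):
        if total >= threshold:
            return cumulative_cases[day:]
    return []


# Made-up cumulative counts for illustration only.
series = [0, 0, 1, 3, 7, 20, 55, 140]

since_first_case = align_to_threshold(series)        # day 0 = first case
since_50_cases = align_to_threshold(series, 50)      # day 0 = 50+ cases
```

After trimming, the list index is the "days since" value, so series from different countries can be plotted on a shared x-axis.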