Open Science in corona times

Joachim Gassen heads the Open Science Data Center of the TRR 266. He discusses three projects the Center has recently launched in response to the Covid-19 pandemic. 


The core objective of the Open Science Data Center of the TRR 266 “Accounting for Transparency” is to promote the benefits of open science. We see a clear shift in the interest of both the academic community and the public toward the Covid-19 pandemic, the effects of governmental interventions, and the consequences for the economy and society at large. The role of open science is to provide input on these matters in such a way that others can collaborate and contribute. Normally, these others would mostly be fellow academics. However, given the general interest in the topic, a wide audience currently wants to understand and assess Covid-19-related data. This situation is an important opportunity for open science to shine.

So how do we address this need when we are mostly busy setting up the data infrastructure for the TRR 266 and wrangling data on corporate disclosures? Given the expertise of the TRR 266 in regulatory interventions, data handling, and visualizations, we decided that the Open Science Data Center of the TRR 266 could help best by working on three interrelated projects:

  1. Provide a code-based and maintained infrastructure for retrieving publicly available country-level data on the spread of the Covid-19 pandemic, on related governmental interventions and on behavioral effects of these interventions.
  2. Assess and compare the data quality of publicly-available governmental intervention data sources.
  3. Provide a visualization tool to the general public that allows everybody to independently assess the development of the pandemic, showcasing how design choices affect the message of data visualizations.

The outputs from these projects differ from “traditional” research outputs in paper form, as the idea of open science is to make research available and reusable. Also, we wanted to act quickly so that fellow researchers can make use of our efforts for their own projects.

PROJECT 1: R package “tidycovid19”

For the first project, we published the R package “tidycovid19”. It allows easy download of Covid-19-related data from various sources. Among other sources, the data include country-day data on the spread of the pandemic, the two main sources for government intervention data, and mobility data provided by Google and Apple. Some of these data come in already well-structured formats. Other data need more care before they can be used. And then there are data that were provided in PDF format, with the most relevant information hidden in line graphs. The package provides functions to download, clean, and merge all these data sources into a combined and ready-to-use country-day level dataset. A series of blog posts on my personal website discusses the data and coding approach, allowing others to not only use the package, but also use its code base for inspiration:

Read more about the package here

The complete code base of the package is available on GitHub
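
To illustrate what a combined country-day dataset means, here is a minimal, language-agnostic sketch written in Python. It is not the code of the tidycovid19 package (which is written in R); the function name merge_country_day, the column names, the country codes AAA and BBB, and all numbers are made up for illustration. The sketch joins two sources on country and date and keeps observations that appear in only one of them.

```python
from datetime import date


def merge_country_day(*sources):
    """Outer-join several sources on (iso3c, date).

    Each source is a list of dicts that contain the keys 'iso3c' and 'date'
    plus any number of value columns. Hypothetical helper for illustration.
    """
    merged = {}
    columns = []
    for rows in sources:
        for row in rows:
            key = (row["iso3c"], row["date"])
            record = merged.setdefault(
                key, {"iso3c": row["iso3c"], "date": row["date"]}
            )
            for col, value in row.items():
                if col in ("iso3c", "date"):
                    continue
                record[col] = value
                if col not in columns:
                    columns.append(col)

    # Fill columns that a country-day did not receive from any source.
    result = []
    for key in sorted(merged):
        record = merged[key]
        for col in columns:
            record.setdefault(col, None)
        result.append(record)
    return result


# Made-up example data, not actual Covid-19 figures.
cases = [
    {"iso3c": "AAA", "date": date(2020, 4, 1), "confirmed": 100},
    {"iso3c": "AAA", "date": date(2020, 4, 2), "confirmed": 150},
]
mobility = [
    {"iso3c": "AAA", "date": date(2020, 4, 2), "mobility_change": -40},
    {"iso3c": "BBB", "date": date(2020, 4, 2), "mobility_change": -25},
]

merged = merge_country_day(cases, mobility)
for record in merged:
    print(record)
```

Keeping unmatched country-days with missing values, rather than dropping them, is a design choice in this sketch: it makes gaps in the coverage of a source visible instead of hiding them.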


PROJECT 2: Assessing publicly-available governmental intervention data sources

As the TRR 266 has documented expertise in studying regulatory interventions, our key focus with regard to Covid-19 is documenting regulatory interventions and their effects. Luckily, there are several initiatives that systematically collect intervention data at a global level. The two largest ones are:

  • the data repository hosted by ACAPS (an NGO focusing on providing independent information on humanitarian issues) and
  • the government response tracker set up by the Blavatnik School of Government at the University of Oxford.

Assessing and comparing the data sources of ACAPS and the Blavatnik School of Government is informative, since they organize their data differently, resulting in varying levels of data quality. Assuring data quality is a key research topic in open science and of central interest in the Covid-19 setting, given the timeliness and fundamental importance of these data repositories. Again, we have documented our work in a blog post, to be found on my personal website. The post includes code to verify our analyses. We are currently working with the data providers to help develop the data infrastructure on Covid-19-related governmental interventions.


PROJECT 3: Visualization Tool

While the first two projects target a scientific audience in order to support research that focuses on the role of governmental interventions in the Covid-19 pandemic, the last project targets a wider audience. It is comforting to see that academic research, data and expertise are held in high regard by the public in these times. However, data-driven assessments require experience. The spread of the virus is often communicated in visual displays (see the coverage of the Financial Times as a central example). However, these displays have staggering degrees of freedom. Consequently, by setting these strategically, an analyst can communicate very different messages. To provide some insights into this matter and to allow the interested audience to design their own visuals of the pandemic spread, we developed, as part of the R package mentioned above, an interactive display. It allows users to explore the spread of the Covid-19 pandemic across various dimensions. A related blog post on my personal website discusses the issue of “visualization degrees of freedom”. It showcases some examples of how one can communicate very differently based on the same underlying data.

Explore the display here
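
The idea of “visualization degrees of freedom” can be made concrete with a small sketch in Python. It is not taken from the interactive display or the R package; the function names prepare_series and leader, the two fictional countries AAA and BBB, and all numbers are assumptions made for illustration. The sketch applies two common design choices, scaling by population and starting the time axis once a case threshold is reached, and shows that the country that appears “worst” depends on the choice.

```python
def prepare_series(cumulative, population, per_capita=False, min_cases=0):
    """Return the values one would plot for a country under given design choices.

    cumulative: cumulative case counts by day (made-up numbers in this sketch)
    per_capita: if True, express values per 100,000 inhabitants
    min_cases:  drop days before the cumulative count reaches this threshold
    """
    values = [v for v in cumulative if v >= min_cases]
    if per_capita:
        values = [v / population * 100_000 for v in values]
    return values


def leader(countries, **choices):
    """Name of the country with the highest final plotted value."""
    final = {
        name: prepare_series(c["cumulative"], c["population"], **choices)[-1]
        for name, c in countries.items()
    }
    return max(final, key=final.get)


# Two fictional countries with made-up data.
countries = {
    "AAA": {"population": 80_000_000, "cumulative": [50, 120, 300, 700, 1500]},
    "BBB": {"population": 5_000_000, "cumulative": [10, 40, 110, 250, 500]},
}

leader_absolute = leader(countries)
leader_per_capita = leader(countries, per_capita=True)

print("Highest absolute count:", leader_absolute)
print("Highest count per 100,000 inhabitants:", leader_per_capita)
```

With the same underlying data, the absolute view puts the larger country on top, while the per-capita view puts the smaller one on top. The min_cases threshold additionally changes how many days of each series are shown.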


In combination, these three projects contribute to the rapidly growing open science community that engages with the Covid-19 pandemic. It is comforting and uplifting to see that research can act fast to tackle pressing issues. We at the Open Science Data Center of the TRR 266 are happy to contribute our tiny bits to a big and challenging puzzle.


To cite this blog:

Gassen, J. (2020, April 21). Open Science in corona times, TRR 266 Accounting for Transparency Blog.


