Scientists call for fully open sharing of coronavirus genome data

Hundreds of scientists are urging that SARS-CoV-2 genome data be shared more openly to help analyse how viral variants are spreading around the world.

Researchers have posted huge numbers of SARS-CoV-2 genome sequences online since January 2020. The most popular data-sharing platform, called GISAID, now hosts more than 450,000 viral genomes; Soumya Swaminathan, the chief scientist at the World Health Organization (WHO), has called it a “game changer” in the pandemic. But it doesn’t allow sequences to be reshared publicly, which is hampering efforts to understand the coronavirus and the rapid rise of new variants, argues Rolf Apweiler, co-director of the European Bioinformatics Institute (EBI) near Cambridge, UK, which hosts its own large genome database that includes SARS-CoV-2 sequences.

“The openness of SARS-CoV-2 sequence data is crucial for the rapid response against the biggest health threat to humankind in a very, very long time,” says Apweiler. 

In a letter released on 29 January, Apweiler and others call for researchers to post their genome data in one of a triad of databases that don’t place any restrictions on data redistribution: the US GenBank, the EBI’s European Nucleotide Archive (ENA) and the DNA Data Bank of Japan, which are collectively known as the International Nucleotide Sequence Database Collaboration (INSDC).

Anyone can anonymously access the INSDC’s data and use them as they want, but GISAID requires that users confirm their identity and agree not to republish the site’s genomes without permission from the data provider. This means that studies building on GISAID data, such as those that create evolutionary trees analysing how SARS-CoV-2 variants are related, can’t publish full data so that others can easily check their analyses or further build on their data set. They must direct readers back to the GISAID site.

The letter says the scientific community should “remove barriers that restrain effective data sharing”, but doesn’t mention GISAID specifically. It is signed by more than 500 scientists, including the 2020 chemistry Nobel laureate Emmanuelle Charpentier, and the head of the COVID-19 Genomics UK Consortium, Sharon Peacock. Where scientists have already established submissions to other databases, the letter states, “these submissions should continue in parallel”.

Data visualisation of the genomes of the 56 fully sequenced isolates of the virus SARS-CoV-2

