The mystery of the coronavirus sequences deleted from the databases at the beginning of the pandemic


The debate on the origin of the coronavirus has been rekindled after a researcher claimed to have found, on Google Cloud, 13 SARS-CoV-2 sequences that could help pin down when the spillover occurred, and that had been deleted

(photo: Getty Images)

A total of 241 SARS-CoV-2 genome sequences derived from Chinese patients during the early stages of the pandemic were deleted from the database in which they had been registered. The person who found this out, and who recovered 13 of them from Google Cloud backups, is Jesse Bloom, who studies the evolution of viruses at the Fred Hutchinson Cancer Research Center in Seattle and describes his investigation in an article available on bioRxiv, pending review by other scientists. The disclosure, however, does not really seem to support any conspiracy hypothesis, and the rediscovered sequences could help reconstruct the history of the coronavirus.

The investigation

For Bloom, that rambling report by the World Health Organization (WHO) on the origins of the coronavirus just wasn’t enough. He therefore began to retrace the commission’s steps in search of the first sequences of the SARS-CoV-2 genome.

He thus came across a Chinese study that reported mutations found in genomic sequences derived from biological samples taken from patients with Covid-19 in China at the beginning of the pandemic. The sequences, not fully reported in the paper, had been deposited in the Sequence Read Archive (SRA), a database overseen by a division of the US National Institutes of Health (NIH).

When he tried to retrieve the complete sequences directly from the database, however, Bloom realized that they were gone: they had been deleted. With further investigation, he managed to recover 13 of them, still preserved in cloud backups.

The mystery of the deleted sequences

Why have those sequences been removed from the database?
Contacted directly by email, the authors of the Chinese study have not yet responded, but an NIH spokesperson said the deletion was requested by the authors (who hold ownership of the data) because the sequences were being updated and would then be uploaded to another database.

Bloom, however, reports that he has not yet been able to find them elsewhere, and considers the matter a little suspicious. Was it a cover-up? Not everyone in the scientific community seems to favor this hypothesis: after all, as University of Utah virologist Stephen Goldstein points out in Science magazine, the Chinese article is still available, and it had been for a year before the request to delete the sequences from the database came. Perhaps, having been published in a minor journal, it simply escaped scientists’ radar.

What the rediscovered sequences tell us

Bloom’s discovery doesn’t add much to what was already known, or suspected, about the origin of the coronavirus: namely, that SARS-CoV-2 (or a very close relative of it) was very likely already circulating in China before December 2019, and that the wet market in Wuhan either was not the place of the spillover or was not the only one (so much so that some of the first cases of Covid-19 had no connection to it). The sequences found by Bloom, in fact, do not contain three mutations typical of the version of SARS-CoV-2 found at the wet market, but they do have similarities with the genome of the bat coronavirus identified in 2013. Perhaps, therefore, they are sequences belonging to an intermediate link in the evolution of the coronavirus that could help experts uncover its true history.

Putting aside the cover-up hypothesis, Bloom’s analysis, although it has yet to be scrutinized by other experts, seems methodologically interesting and indicates a direction to follow. The internet needs to be scoured: let the hunt for the lost sequences begin.

