A scientist from Seattle has recovered mysteriously deleted coronavirus sequences from the early days of the outbreak in Wuhan.
It is widely believed that the novel coronavirus originated at a seafood market in Wuhan, China. Since the start of the pandemic, however, multiple theories about its origin have been put forward. Many believe the virus may have leaked from a bio lab, either deliberately or accidentally, although a WHO-China joint report has dismissed that theory. Now, scientist Jesse Bloom’s detective work has added a new twist to the virus’s origin story. The researcher managed to recover deleted coronavirus sequences from some of the earliest COVID-19 cases reported in Wuhan.
Jesse works at the Fred Hutchinson Cancer Research Center in Seattle, where he studies the evolution of viruses. While investigating the origin of the coronavirus, he came across a study from May 2020. In it, the authors listed 241 genetic sequences of SARS-CoV-2, collected as part of a Wuhan University project, that had supposedly been uploaded to an online database. He then searched that database, the Sequence Read Archive (SRA), for the 241 sequences. The SRA, maintained by the National Institutes of Health (NIH), is a permanent archive of deep sequencing data. However, Jesse’s search brought up no results.
Putting his sleuthing skills to the test, Jesse then dug a little deeper. His search led him to a March 2020 study by Aisu Fu and Renmin Hospital of Wuhan University. The study reported genetic sequences from 45 nasal swab samples collected from outpatients suspected of having COVID-19 during the early days of the outbreak. However, these sequences were nowhere to be found on the SRA. Jesse believes the deleted sequences came from these samples.
The Plot Thickens
Through Google Cloud, Jesse was then able to recover some of the deleted files and reconstruct 13 of the deleted sequences. These sequences appear to lack mutations present in samples collected from Wuhan’s seafood market outbreak.
According to the NIH, it received a request from the investigators to delete the sequences they had uploaded and, as per standard practice, it obliged. Notably, the study was released at a time when China’s State Council had ordered that all COVID-19-related publications first be centrally approved. However, it is unclear why the investigators wanted the sequences deleted. Jesse has since emailed both corresponding authors but has yet to hear back from them.
Although Jesse’s findings do not indicate that the novel coronavirus originated in a lab, they do add to the evidence that the virus was circulating in Wuhan before the cases linked to the seafood market. Piecing together genetic sequences from early samples can help researchers identify the progenitor strain and thus trace the virus’s origin.
Jesse reported his findings on the preprint server bioRxiv.
Farkas, C., F. Fuentes-Villalobos, J. L. Garrido, J. Haigh, and M. I. Barría, 2020 Insights on early mutational events in SARS-CoV-2 virus reveal founder effects across geographical regions. PeerJ.
Wang, M., A. Fu, B. Hu, Y. Tong, R. Liu, et al., 2020a Nanopore target sequencing for accurate and comprehensive detection of SARS-CoV-2 and other respiratory viruses. medRxiv 10.1101/2020.03.04.20029538.