Novel Coronavirus Circulated Undetected Months Before First COVID-19 Cases in Wuhan
A new study dates the emergence of the virus that causes COVID-19 to as early as October 2019. Simulations also suggest that in most cases, zoonotic viruses die out naturally before causing a pandemic.

University Communications and University of California San Diego
March 18, 2021

Using molecular dating tools and epidemiological simulations, researchers at the University of Arizona, University of California San Diego School of Medicine and Illumina, Inc., estimate that the SARS-CoV-2 virus was likely circulating undetected for at most two months before the first human cases of COVID-19 were described in Wuhan, China, in late December 2019.

In the March 18 online issue of the journal Science, they also note that their simulations suggest that the mutating virus dies out naturally more than two-thirds of the time without causing an epidemic.

"A lot has been learned over the last year about this pandemic, but one of the most important questions of all has remained unanswered: When exactly did the outbreak begin?" said co-corresponding author Michael Worobey, professor and head of the University of Arizona Department of Ecology and Evolutionary Biology.

"Our study was designed to answer the question of how long could SARS-CoV-2 have circulated in China before it was discovered?" added senior author Joel O. Wertheim, associate professor in the Division of Infectious Diseases and Global Public Health at UCSD.

"To answer this question, we combined three important pieces of information: a detailed understanding of how SARS-CoV-2 spread in Wuhan before the lockdown, the genetic diversity of the virus in China and reports of the earliest cases of COVID-19 in China. By combining these disparate lines of evidence, we were able to put an upper limit of mid-October 2019 for when SARS-CoV-2 started circulating in Hubei province," Wertheim said.

Cases of COVID-19 were first reported in late December 2019 in Wuhan, in the Hubei province of central China. The virus quickly spread beyond Hubei. Chinese authorities cordoned off the region and implemented mitigation measures nationwide. By April 2020, local transmission of the virus was under control, but by then COVID-19 had become a pandemic, with more than 100 countries reporting cases.

SARS-CoV-2 is a zoonotic coronavirus, believed to have jumped from an unknown animal host to humans. Numerous efforts have been made to identify when the virus first began spreading among humans, based on investigations of early-diagnosed cases of COVID-19. The first cluster of cases – and the earliest sequenced SARS-CoV-2 genomes – were associated with the Huanan Seafood Wholesale Market, but study authors say the market cluster is unlikely to have marked the beginning of the pandemic because the earliest documented COVID-19 cases had no connection to the market.

Regional newspaper reports suggest COVID-19 diagnoses in Hubei date back to at least November 17, 2019, which indicates the virus was already actively circulating when Chinese authorities enacted public health measures.

In the new study, researchers used molecular clock evolutionary analyses to try to home in on when the first, or index, case of SARS-CoV-2 occurred. The molecular clock technique uses the mutation rate of genes to deduce when two or more lifeforms diverged – in this case, when the common ancestor of all variants of SARS-CoV-2 existed, which this study estimates to have been as early as mid-November 2019.
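The arithmetic behind a molecular clock can be sketched in a few lines of Python. This is a minimal illustration under a strict-clock assumption, not the method used in the study, which relied on far more sophisticated phylogenetic inference. The function name, the substitution rate and the example inputs are hypothetical values chosen only to show the calculation.

```python
from datetime import date, timedelta


def estimate_common_ancestor_date(sample_date, substitutions,
                                  rate_per_site_per_year, genome_length):
    """Back-date a common ancestor under a strict molecular clock.

    Assumes mutations accumulate at one constant rate, so the elapsed
    time is simply the observed substitutions divided by the expected
    number of substitutions per year across the whole genome.
    """
    expected_per_year = rate_per_site_per_year * genome_length
    years_elapsed = substitutions / expected_per_year
    days_elapsed = round(years_elapsed * 365.25)
    return sample_date - timedelta(days=days_elapsed)


if __name__ == "__main__":
    # Hypothetical inputs: a genome sampled on January 1, 2020 that
    # differs from the inferred ancestor at 2 sites, with an assumed
    # rate of 1e-3 substitutions per site per year over 30,000 sites.
    print(estimate_common_ancestor_date(date(2020, 1, 1), 2, 1e-3, 30_000))
```

With these made-up inputs, two substitutions at about 30 expected substitutions per year correspond to roughly 24 days, which places the ancestor in early December 2019. Real analyses also account for uncertainty in the rate and in the shape of the evolutionary tree.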

Molecular dating of the most recent common ancestor is often taken to be synonymous with the index case of an emerging disease. 

However, said Worobey, "the index case can conceivably predate the common ancestor – so the actual first case of this outbreak may have occurred days, weeks or even many months before the estimated common ancestor. Determining the length of that 'phylogenetic fuse' was at the heart of our investigation."

Based on this work, the researchers estimate that the median number of people infected with SARS-CoV-2 in China was less than one until November 4, 2019. Thirteen days later, on November 17, it was four, and by December 1, 2019, it was just nine. The first hospitalizations in Wuhan with a condition later identified as COVID-19 occurred in mid-December.

Study authors used a variety of analytical tools to model how the SARS-CoV-2 virus may have behaved during the initial outbreak and the early days of the pandemic, when it was largely an unknown entity and the scope of the public health threat was not yet fully realized.

These tools included epidemic simulations based on the virus's known biology, such as its transmissibility and other factors. In just 29.7% of these simulations was a virus with the properties of SARS-CoV-2 able to create self-sustaining epidemics. In the other 70.3%, the virus infected relatively few persons before dying out. The average failed epidemic ended just eight days after the index case.
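Why so many simulated outbreaks fizzle can be illustrated with a simple branching process, in which each infected person passes the virus to a random number of others. The Python sketch below is a toy model and not the simulation framework used in the study. The reproduction number, the dispersion value and the threshold for calling an outbreak "established" are all assumptions chosen for illustration, and the model is not expected to reproduce the 70.3% figure.

```python
import math
import random


def poisson(rng, mean):
    """Draw a Poisson-distributed count using Knuth's multiplication method."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def outbreak_dies_out(rng, r0, dispersion, establish_at=100):
    """Simulate one outbreak from a single index case.

    Each case infects a negative-binomial number of others, drawn as a
    gamma-Poisson mixture with mean r0. A small dispersion means most
    cases infect nobody while a few infect many. Returns True if the
    chain of transmission ends before establish_at simultaneous cases.
    """
    active = 1
    while 0 < active < establish_at:
        next_generation = 0
        for _ in range(active):
            individual_mean = rng.gammavariate(dispersion, r0 / dispersion)
            next_generation += poisson(rng, individual_mean)
        active = next_generation
    return active == 0


def extinction_fraction(r0, dispersion, runs=2000, seed=0):
    """Fraction of simulated outbreaks that die out on their own."""
    rng = random.Random(seed)
    extinct = sum(outbreak_dies_out(rng, r0, dispersion) for _ in range(runs))
    return extinct / runs


if __name__ == "__main__":
    # Assumed values for illustration only: r0=2.5, dispersion=0.1.
    print(f"Share of outbreaks that went extinct: "
          f"{extinction_fraction(r0=2.5, dispersion=0.1):.1%}")
```

Even with an average of more than one new infection per case, many simulated chains end early because the index case, or the handful of people after it, happen to infect no one. Lowering the assumed reproduction number, as might apply in a sparsely populated area, pushes the extinct share higher.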

"Typically, scientists use the viral genetic diversity to get the timing of when a virus started to spread," Wertheim said. "Our study added a crucial layer on top of this approach by modeling how long the virus could have circulated before giving rise to the observed genetic diversity.

"Our approach yielded some surprising results. We saw that over two-thirds of the epidemics we attempted to simulate went extinct. That means that if we could go back in time and repeat 2019 100 times, two out of three times, COVID-19 would have fizzled out on its own without igniting a pandemic. This finding supports the notion that humans are constantly being bombarded with zoonotic pathogens."

Wertheim noted that even as SARS-CoV-2 was circulating in China in the fall of 2019, the researchers' model suggests it was doing so at very low levels until at least December of that year.

"Given that, it's hard to reconcile these low levels of virus in China with claims of infections in Europe and the U.S. at the same time," Wertheim said. "I am quite skeptical of claims of COVID-19 outside China at that time."

The original strain of SARS-CoV-2 became epidemic, the authors write, because it was widely dispersed, which favors persistence, and because it thrived in urban areas where transmission was easier. In simulated epidemics involving less dense rural communities, epidemics went extinct 94.5% to 99.6% of the time.

The virus has since mutated many times, with a number of variants evolving to become more transmissible.

"In addition to clarifying when that first person became infected in China, our work shows just what an immense challenge it is to nip in the bud a pandemic caused by a virus like SARS-CoV-2," Worobey said. "There would have been so few cases in those early weeks, with a large proportion of those cases not even showing symptoms. We are going to have to up our game dramatically if we hope to block future pandemics like this one."

Study co-authors include Jonathan Pekar and Niema Moshiri of UCSD and Konrad Scheffler of Illumina, Inc.


Resources for the media

Media contact(s)

Daniel Stolte

Science Writer, University Communications

Researcher contact(s)

Michael Worobey

Department of Ecology and Evolutionary Biology