When cases of COVID-19 began popping up in Washington state in late February, researchers were quick to dive into the genetics of the viruses infecting residents. Based on what they knew at the time, they hypothesized that those cases in late February were genetically linked to the very first case found in the state—one in a person who arrived in Washington on January 15 after traveling from Wuhan, China, where the outbreak began. The case was also the first infection identified in the whole of the United States.
If correct, the genetic hypothesis linking the late February cases to that very first case meant that early efforts to contain the pandemic coronavirus—isolating the initial patient, tracing contacts, etc.—had failed spectacularly. It also meant that the virus, SARS-CoV-2, had been cryptically circulating in the state for six weeks. And that would mean that, in addition to those early cases, there were potentially hundreds or thousands of others out there, undetected and possibly spreading the infection further.
The hypothesis played into state officials’ decision to issue some of the country’s earliest social-distancing measures. But now that we know far more about the genetics of circulating SARS-CoV-2 viruses, that hypothesis appears to be wrong.
Researchers now have the full genetic blueprints of more than 25,000 SARS-CoV-2 viruses isolated from patients. And with all that accumulated knowledge, a new analysis suggests that the cases detected in Washington in late February were not linked to that first case in January. Rather, the Evergreen State’s February outbreak was likely sparked by additional introductions of SARS-CoV-2 strains. The analysis pegs the igniting introduction to February 13, two weeks before the cases became apparent.
An important note: the new analysis is a preprint, which means it has not yet been published in a scientific journal or gone through the standard scientific vetting of peer review.
That said, several researchers accept the findings—generally, at least. And the analysis flows into a growing stream of cautions and corrections about genetic analyses done early in an emerging outbreak.
Specifically, the analyses in question are phylogenetic, i.e., they look at the genetic variation and evolution of individual organisms to understand how they relate to each other. All COVID-19 victims are infected with the novel coronavirus, SARS-CoV-2, of course. But not all SARS-CoV-2 viruses are exactly the same. When a person is infected, SARS-CoV-2 viral particles get into cells of the body and hijack the cells’ molecular machinery to make copies of themselves, which can then go on to infect someone else. In all the frenetic viral cloning, there can be small errors in the copies of genetic code. And as virus particles move from person to person to person, the viral code can drift, accumulating new sets of mutations or picking up other variations.
By looking at the genetic sequences of many SARS-CoV-2 viruses and identifying the differences between them, geneticists can try to work out how they all relate to each other and how the virus is evolving overall. In doing so, researchers can also help understand the chains of transmission from person to person. For instance, if two patients are infected with viral strains that have very different sets of mutations, it’s unlikely one of them infected the other. If the viruses are nearly identical, it’s possible—though not certain—the infections are linked.
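The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline any of the researchers used: it assumes the two genomes are already aligned to the same length, the function names are made up for this example, and the `max_diffs` cutoff is an arbitrary placeholder rather than a published threshold.

```python
def count_differences(seq_a: str, seq_b: str) -> int:
    """Count positions where two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)


def transmission_hint(seq_a: str, seq_b: str, max_diffs: int = 2) -> str:
    """Turn a difference count into a cautious statement.

    max_diffs is a hypothetical cutoff chosen only for illustration.
    """
    if count_differences(seq_a, seq_b) > max_diffs:
        return "direct link unlikely"
    # Near-identical genomes are compatible with a link but do not prove one.
    return "link possible, not proven"


# Toy 10-letter sequences standing in for full genomes.
patient_1 = "ACGTACGTAC"
patient_2 = "ACGTTCGTAA"  # differs from patient_1 at two positions
patient_3 = "TTGTTCGAAA"  # differs from patient_1 at five positions

print(transmission_hint(patient_1, patient_2))
print(transmission_hint(patient_1, patient_3))
```

Note that the function never returns anything stronger than "possible" for similar genomes. That asymmetry is the point: many differences can rule a direct link out, but few differences cannot confirm one.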
As for what those mutations might mean: sometimes they’re meaningless, sometimes they’re not. In the worst cases, they could allow a virus to cause more severe disease, for instance, or make it better at spreading between people.
So far, there’s no firm evidence that SARS-CoV-2 is picking up scary mutations. In fact, SARS-CoV-2 doesn’t seem to be evolving much at all. The thousands of genetic sequences researchers have been gathering show that SARS-CoV-2 has a relatively slow rate of evolution.
That may be good news for humanity in general, but it is definitely bad news for making sense of phylogenetic data. The slow evolution means that many of the viruses look very similar to each other, making it hard to untangle how exactly they’re related. That’s what led researchers to suspect that the February cases of COVID-19 in Washington state were linked to the first case in January.
In the new analysis (led by Michael Worobey, an expert in the evolution of pathogens at the University of Arizona), researchers took a close look at the genetic data that has accumulated since the end of February. They note that, although the SARS-CoV-2 genome from the first Washington case, dubbed WA1, is very similar to those of the other SARS-CoV-2 viruses found in the state, it differs from them at two nucleotides, the individual letters of the genetic code.
“[D]espite hundreds of genomes sequenced in Washington state, no viruses with genomes identical to WA1 or transitional between it and the outbreak clade (i.e. having a C at position 17747 or an A at position 17858, like WA1) had been sampled there,” the authors wrote.
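The check the authors describe amounts to reading two positions in each genome. Here is a rough sketch of that idea, not the authors' actual method. It assumes each genome is a string aligned to the same reference coordinates the quote uses (counted from 1), and the function name and the three labels are hypothetical.

```python
# Positions and WA1 bases are taken from the quoted passage (1-based).
WA1_SITES = {17747: "C", 17858: "A"}


def classify_genome(genome: str) -> str:
    """Label a genome by how many of the two sites carry the WA1 base."""
    if len(genome) < max(WA1_SITES):
        raise ValueError("genome is too short to cover both sites")
    matches = sum(
        1
        for position, wa1_base in WA1_SITES.items()
        if genome[position - 1].upper() == wa1_base  # convert to 0-based index
    )
    if matches == len(WA1_SITES):
        return "WA1-like"       # WA1 base at both sites
    if matches > 0:
        return "transitional"   # WA1 base at exactly one site
    return "outbreak-like"      # WA1 base at neither site
```

In these terms, the authors' observation is that none of the hundreds of Washington genomes they examined would come back as "WA1-like" or "transitional." A real analysis would also have to handle missing or ambiguous bases, which this sketch simply counts as non-matching.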
To take things further, Worobey and colleagues set up simulations with the latest estimate for SARS-CoV-2’s evolutionary rate, the rate at which it picks up such changes. None of their simulations seeded with WA1 produced viral strains with genetic changes similar to those seen in the real outbreak viruses. Other modeling, however, strongly supported the possibility that additional introductions of SARS-CoV-2 in Washington from China led to the outbreak in February. They estimated that another case arrived in the state on February 13, with a possible range of February 7 to February 19.
Though the analysis still needs thorough peer review, other researchers seemed to accept the general conclusion that additional introductions of SARS-CoV-2 likely led to Washington’s outbreak—not the initial case in January.
That includes Trevor Bedford, the researcher who first hypothesized that WA1 was the source of the February outbreak in the state. Bedford, an expert in viral evolution at the Fred Hutchinson Cancer Research Center in Seattle, conceded on Twitter that his hypothesis has not held up.
“At this point, I don’t think we can exclude a WA1 introduction as an impossibility,” he wrote in a long thread on the topic. “[B]ut I think the most likely scenario is a separate undetected introduction by a genetically identical virus or a virus possessing the additional 17858G mutation.”
Bedford went on to say that he still estimates that there was a large, undetected outbreak at the time outbreak cases surfaced in late February. “Thus, I believe I was wrong in the original assessment of a WA1 introduction, but correct in asserting significant community spread in Washington State on Feb 29,” he wrote. “I’m sorry to have created confusion here. Although I do think that my original actions were warranted given available evidence at the time.”
William Hanage, an epidemiologist and expert in pathogen evolution at Harvard, also agreed with the general takeaway of the analysis by Worobey and colleagues. “It is now pretty clear that there were more introductions” in Washington, he wrote in a Twitter thread responding to Bedford’s comments.
Hanage and colleagues made a cautionary comment on just this issue in Nature Microbiology on May 19, before Worobey’s analysis appeared online. Hanage and his colleagues wrote:
Most viruses can be separated into strains and, if two infections are caused by dissimilar strains, one can rule out transmission. The often-forgotten point is that phylogenies can rule out transmission, but if infections are caused by the same strains or identical viruses it does not definitively prove transmission. During an emerging outbreak when pathogens have not yet diverged into different strains, phylogenetic information is too weak to hypothesize transmission linkage…
Though the link between WA1 and Washington’s subsequent outbreak may be flimsy, there may be a silver lining in all of this: Washington state officials acted on the belief that the state might have a massive, hidden outbreak. As Worobey and co-authors note, Washington was “among the first in the US to initiate restrictions on social distancing and size of gatherings.”
“One irony,” they wrote, was that said action was “founded at least in part on an assumption about the timing of community transmission not supported by the phylogenetic data (i.e. the belief that cryptic transmission had been ongoing since mid-January). This action may have closed the gap between the onset of sustained community transmission and mitigation measures in Washington state, compared to other locales like New York City, in ways that deserve careful reevaluation.”
On March 22, when New York issued a stay-at-home order, the state was seeing over 17 newly confirmed cases a day per 100,000 people, according to IHME data. In April, that number hit as high as 52 new cases a day per 100,000.
On March 23, when Washington issued its stay-at-home order, the state was seeing about three new cases a day per 100,000, and the figure has not risen above eight amid aggressive testing.