Post #624: How many people are spreading this, now, in Fairfax County?

Posted on April 12, 2020

Methods:  The best estimate I could find says that people who are infected with coronavirus walk around for three or four days, in an infectious state, before their symptoms emerge (reference).  And, at least one model of the Wuhan epidemic suggested that there had to be a considerable number of people who were infected, but whose cases were never reported.

Based on that, and the recent number and trend of case counts, I came up with the estimate above.  That’s the last four days’ worth of new cases, adjusted for the recent average daily growth (to bring it up to today), and then a fudge factor added to account for cases mild enough that the infected individual never gets tested (and so does not appear in the case counts).
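For anyone who wants to check the arithmetic, here is a minimal sketch of one way to read that method.  The function name, the one-day lag, and the sample numbers are all my own illustrative assumptions, not actual Fairfax County data, and the exact growth adjustment could reasonably be done differently.

```python
def estimate_infectious(daily_new_cases, daily_growth, hidden_per_known, lag_days=1):
    """Rough count of people who are infectious but not yet counted as sick.

    daily_new_cases  -- reported new cases per day, oldest first
    daily_growth     -- recent average daily growth rate (0.10 means 10% per day)
    hidden_per_known -- assumed unreported cases per reported case (the fudge factor)
    lag_days         -- assumed number of days the data lag behind today
    """
    # Step 1: the last four days' worth of newly reported cases.
    recent = sum(daily_new_cases[-4:])
    # Step 2: scale up by recent growth to bring the figure up to today.
    projected = recent * (1 + daily_growth) ** lag_days
    # Step 3: add the unreported cases on top of the reported ones.
    return projected * (1 + hidden_per_known)


# Hypothetical inputs, chosen only to show the mechanics.
sample_counts = [10, 20, 30, 40, 50]
best_case = estimate_infectious(sample_counts, 0.10, hidden_per_known=1)
worst_case = estimate_infectious(sample_counts, 0.10, hidden_per_known=4)
print(round(best_case), round(worst_case))
```

The point of writing it this way is that the fudge factor is the only input that isn't hard data, so it is easy to see how much the answer swings between the best-case and worst-case assumptions.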

I’ll discuss that last one below.  How many unrecorded cases of coronavirus are there, really?  And how on earth could you know?

So, take this estimate with a grain of salt.  You see all kinds of wild numbers out there.  With this one, at least you know how I came up with it.

And if it’s anywhere near ballpark, it provides you with plenty of reason to take all possible precautions when in public, around others.

Under the worst-case assumptions, if the Pan Am Safeway gets 2000 customers a day, they’re now at the point where they might have eight infectious customers per day, walking through that store.
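The store arithmetic is just a proportion.  Here is a small sketch, assuming shoppers are a random sample of the county, which is a simplification.  The population and infectious counts below are round hypothetical numbers picked to reproduce the eight-per-2000 figure (a rate of 0.4%), not the actual inputs.

```python
def expected_infectious_visitors(customers_per_day, infectious_people, population):
    """Expected infectious customers per day, if shoppers mirror the population."""
    return customers_per_day * infectious_people / population


# Hypothetical round numbers: 4,000 infectious people in a population of 1,000,000.
visitors = expected_infectious_visitors(2000, 4000, 1_000_000)
print(visitors)  # 8.0
```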


The biggest fudge factor in that estimate:  unreported cases.

The count of cases that are being tested and discovered is hard data.  The average incubation period for the disease is fairly well known.

So the biggest fudge factor in that estimate is the number of cases that exist, but are unreported.  That’s also one of the big fear factors for this epidemic.  You’ll see all kinds of crazy numbers for that.  I’d like to take a minute to explain why I used a factor of four (four unreported cases for every reported case) as my worst-case scenario.

For Wuhan, I’ve seen one scholarly estimate that “up to 86%” of cases go unreported.  But how could they know?

That’s based on this computer simulation.  In other words, they looked at how fast the virus spread, the number of known cases at any one point in time, and the assumed “reproductive number”, that is, the number of cases that each person directly infects. Based on that, and based on their assumptions, the researchers concluded that there had to be a whole lot of cases out there that were busy infecting people, but were not being reported.

So their estimate was that 86% of Wuhan cases were unreported.  Not based on measuring anything.  Based on comparing the observed rapid spread of the disease, to their modeled spread.  Based on the assumptions behind their model.

But.  They took the best information available at the time on the reproductive number “R0” for COVID-19.   That’s the number of people that the average person directly infects.  They used a value of 2.38.  But I’m also pretty sure that the most recent research, here in the US, suggests that the virus is far more infectious than they thought.

It’s hard to say whether that’s an actual shift in our understanding, or whether that’s merely due to behavioral differences between the two nations.  The Chinese mask up during epidemics.  We, by contrast, have to be told to do that.  That cultural difference may or may not account for the apparent discrepancy between older research and more modern research on this disease.

Anyway, my understanding is that a more updated value for “R0” is more like five-ish.  Right now, the CDC only says that this appears to spread more effectively than flu.

That also was a study of the very start of the epidemic, where both testing and reporting were poor.  That’s not an estimate for the later stages of the epidemic, when they knew what this was, and were testing for it heavily.

And so, the people who came up with the estimate that 86% of cases weren’t being counted, had this logic:  They saw rapid spread, with few reported cases, and from that inferred the presence of many unreported cases.  But, alternatively, it might just be that the cases that were reported were far more infectious than their model assumed.

Of course, the fraction who are unreported is going to depend on the rate of testing.  But, despite what you hear, and despite the desire for more testing, we really haven’t been slouches in Virginia.  Based on our Department of Health website, Virginia has tested almost 40,000 individuals, compared to the count of cases of just over 5,000.

Finally, the World Health Organization report on Wuhan said that truly asymptomatic cases were rare.  So the idea that there are a lot of people walking around who never knew they were sick, but could infect others, that’s unlikely.  There might be a few.  But that’s not what’s driving the infection rate.

And so, using what may be (in hindsight) a low estimate of how infectious COVID-19 is,  that one piece of research suggested that the known cases were only 14% of the total.  I’m pretty sure that would be challenged now, based on updated information on that basic infection parameter.  And it’s not clear that the situation they studied has any direct relevance to the current situation in the US.

So, I’m going to round that, and just use an estimate of 4 “hidden” cases for every known case as my worst-case scenario.
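It helps to put the percentages and the per-case ratios on the same footing, since the sources quote them both ways.  This is a minimal sketch of the conversion; the function names are mine.

```python
def ratio_from_fraction(fraction_unreported):
    """Hidden cases per known case, given the share of all cases that go unreported."""
    return fraction_unreported / (1 - fraction_unreported)


def fraction_from_ratio(hidden_per_known):
    """Share of all cases that go unreported, given hidden cases per known case."""
    return hidden_per_known / (1 + hidden_per_known)


# The Wuhan estimate of 86% unreported works out to roughly 6.1 hidden per known case.
print(round(ratio_from_fraction(0.86), 1))
# A worst-case factor of 4 hidden per known case means 80% of cases unreported.
print(fraction_from_ratio(4))
# A best-case factor of 1 hidden per known case means 50% unreported.
print(fraction_from_ratio(1))
```

So a factor of four sits somewhat below the Wuhan figure, which fits with the reasons given above for thinking that figure runs high.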

For my best case scenario, frankly, I don’t have a clue.  I’d like to re-run that computer simulation, using a much higher primary reproduction factor, but I have no idea how to do that.  So I make a not-implausible lowball guess of one un-counted case for every counted case.  There’s no more science to it than that.

For a different opinion, here’s a seemingly reasonable article that says there are about two undocumented cases for every documented one.  Here’s another, more thorough analysis, that comes out in the range of just 1.5 undocumented cases for every documented one.

On the other hand, if you haven’t had your fix of fearmongering yet today, read this one.  My take on that is that a) it dates to an era of far more restrictive testing than we have now and b) seemed like more of a round-numbers guess than anything actually worked out in detail.

So why do I bring it up?  Because the number of “hidden” cases is one of the biggest fear-generators for this epidemic.  Don’t believe every number you read.  And check the date of the analysis, because things have moved on pretty quickly.

The upshot is, I’m reasonably comfortable that my range of best case to worst case probably includes the “true” number.  Whatever it is.  Doesn’t mean that I’ve covered every guess that’s ever gotten into the newspapers.  But of the reasonably sober analyses of it, yeah, I think I have that covered.


Contact tracing:  CDC investigation of one early cluster of cases

If you want to get an intuitive grasp of how this disease spreads, look at the CDC’s detailed study of one cluster of cases in Chicago.  This cluster of 16 cases was the result of one guy who felt a little ill, but not badly enough to keep him from attending a funeral and a birthday party.

This is taken from:  Ghinai I, Woods S, Ritger KA, et al. Community Transmission of SARS-CoV-2 at Two Family Gatherings — Chicago, Illinois, February–March 2020. MMWR Morb Mortal Wkly Rep. ePub: 8 April 2020. DOI: http://dx.doi.org/10.15585/mmwr.mm6915e1.

Figure (from MMWR mm6915e1):  Community Transmission of SARS-CoV-2 at Two Family Gatherings — Chicago, Illinois, February–March 2020

To read this:  The time axis is on the bottom.  The box at the left is the guy.  Every horizontal line is somebody who got infected in this cluster, either directly by that guy, or by the people he infected.

Black boxes are times of death, gray boxes are times of hospitalization.  The distance between the gray and black boxes on any line represents the length of hospital stay.

All the rest of the boxes are people who got the disease, but not badly enough to be hospitalized.  Solid boxes means confirmed by lab test.  Dashed boxes means they are pretty sure they had it, but that’s based on symptoms alone.

On any given horizontal line, the distance between where that line starts and the blue box is the amount of time that person was walking around, already infected but unaware of it.

Based on cases like this, the CDC is (or at least, was) fairly convinced that the disease was largely spread by close contact with symptomatic individuals.  But now, if you go to their website, they have softened their position a bit, based on newer information.  Now they are willing to say that this is a new disease, no one is entirely sure about how it can spread, and there are some studies showing that people can spread it even if they don’t (yet) have symptoms.

As a final note, there is a lot of talk about people who never have symptoms, but can spread the disease.  Based on the Chinese experience, a) people who get infected but don’t have symptoms are thought to be quite rare, and b) if so, they’d eventually stop spreading the disease as they rid themselves of the virus.  (How did the Chinese find infected individuals if they didn’t have symptoms?  They routinely tested everyone in the family of those who were infected.)  The upshot is that if you catch this from a person without symptoms, that’s almost always going to be a pre-symptomatic individual.  Somebody who was infected, but who is not yet showing symptoms.

If you look above day 14, the dashed box there is the fourth generation of infection, starting with the guy.  Case A4.1.  That person just happened to live in the same house as somebody who was being cared for by somebody who attended a birthday party that was attended by that guy.

When the CDC talks about this being spread by close contact with infected individuals, this is what they are talking about.  This is why, when you’re outside the home, among others, you have to keep your distance and wear a mask.  Not just because you want to avoid becoming one of those boxes.  But also because you may be one of those boxes and not yet know it.