Post #689: Statistical comparison of states re-opening and not, first revisit

Posted on May 13, 2020

As of the 5/11/2020 data, there remains no material difference between these two groups of states in terms of their trend in COVID-19 cases. That said, it’s still probably too soon to expect to see any difference.

Detail follows.

In Post #685, I took the New York Times definition of which states were re-opening and which were not, and set up a comparison between them.

On average, the “re-opening” states were far more rural, and had a far lower average case load (COVID-19 cases per capita). And, by the luck of the draw (or perhaps because the virus cannot distinguish Republicans from Democrats), the two sets of states had near-identical percentage growth rates in their COVID-19 cases.

That set up an obvious “pre-post-with-control” statistical comparison.  Taking May 1 as “the opening date”, if re-opening materially increases transmission of disease, we ought to see those trends begin to diverge some time after that.  So it’s a pre-and-post May 1 comparison between the early-reopening states (the “treatment” group) and the late-reopening states (the “control” group).
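For readers who want to see the arithmetic behind a pre-post-with-control comparison, here is a minimal sketch in Python. It is not the spreadsheet behind this post. The function names and every number in it are made up for illustration, and it assumes the quantity being compared is the average daily percentage growth in cumulative cases for each group.

```python
from statistics import mean


def daily_growth_rates(cumulative_cases):
    """Day-over-day percentage growth in a cumulative case series."""
    return [
        100.0 * (today - yesterday) / yesterday
        for yesterday, today in zip(cumulative_cases, cumulative_cases[1:])
    ]


def diff_in_diff(treat_pre, treat_post, control_pre, control_post):
    """Change in the treatment group's average minus the change in the control group's."""
    change_treat = mean(treat_post) - mean(treat_pre)
    change_control = mean(control_post) - mean(control_pre)
    return change_treat - change_control


# Hypothetical daily growth rates (percent per day), NOT real data.
reopening_pre = [3.0, 3.0, 3.0]   # "treatment" group before May 1
reopening_post = [2.5, 2.5, 2.5]  # "treatment" group after May 1
control_pre = [3.0, 3.0, 3.0]     # "control" group before May 1
control_post = [2.0, 2.0, 2.0]    # "control" group after May 1

effect = diff_in_diff(reopening_pre, reopening_post, control_pre, control_post)
print(f"Difference-in-differences: {effect:+.2f} percentage points per day")
```

A value near zero means the two groups' trends moved together. A positive value means growth in the re-opening states fell by less (or rose by more) than in the comparison states.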

There are, of course, many caveats.

First, any difference should take quite a while to show up.  That’s for a number of reasons, including but not limited to:

  1. The disease incubation period (mean five days from infection to symptom onset);
  2. Reporting lag (at least five days, probably longer, on average, between symptom onset, seeking medical help, and having the case enter the official counts);
  3. A generally “soft”, partial-and-measured re-opening, where only a subset of businesses re-opened, at something less than capacity, and with a less-than-enthusiastic average response from the public.
  4. Some staggering of the opening dates — some of these states began re-opening before May 1, some states began a few days after May 1.
  5. Churches — in my mind, a likely source of infection spread — typically only meet weekly, and many appeared unprepared to resume services on 5/1/2020.
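To put rough dates on the first two items, here is a small sketch of the lag arithmetic. It uses only the figures in the list (a mean five-day incubation period and a reporting lag of at least five days) and treats them as simple additive delays, which is a simplifying assumption. The variable names are mine.

```python
from datetime import date, timedelta

opening_date = date(2020, 5, 1)          # nominal re-opening date used in this post
incubation = timedelta(days=5)           # mean time from infection to symptom onset
reporting_lag_min = timedelta(days=5)    # "at least five days" from onset to official count

# Earliest date an infection acquired on opening day would typically be counted.
earliest_visible = opening_date + incubation + reporting_lag_min
print(earliest_visible.isoformat())
```

Under those assumptions, infections from the very first day of re-opening would only begin to enter the official counts around the date of the data used here, and the other items on the list push that later still.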

So why did I run the numbers, if I didn’t really expect to see a result?  I wanted to be sure that I didn’t see a result.  In other words, this initial run-through was more-or-less a check on the methods.  I’d like to be sure that I don’t see results, where I shouldn’t see results.

Second, it’s entirely possible that this analysis will never show results.  Again, for lots of good reasons, including but not limited to:

  1. At some point, all the states will re-open businesses, so I lose my comparison group.
  2. States may cheat if the numbers turn against them.  I hear that Florida has already started changing the numbers, excluding nursing homes and other “outbreaks” from their data.
  3. The variation in how states go about this may add so much random “noise” that it may swamp any systematic differences.
  4. Variation in the extent to which state populations embrace the re-opening of business, ditto.

So why do this at all?  Because any reasonably systematic approach beats the baloney you are going to read in various news services.  Newspaper reporters naturally focus on click-bait — the outliers, the worst cases, and the eye-catching numbers.  Fox News and similar can be relied upon to suppress anything that appears to question this policy.  And so on.

Can this analysis be improved?  Sure, and I may try to do that, by looking at the nuances of just exactly when, where, and how states re-opened businesses.  But that’s a lot of work.  And, frankly, unless there’s some whopping big effect at the us-versus-them level, there will probably be no impact on people’s thinking, anyway.

What I think will be the most interesting is whether there will be even one state, or even one large city, that has to backtrack on its re-opening policy. Given the wide variability of the disease, of the transmission routes, and so on, you’d certainly expect that there would be.

If not, that suggests that there are common-but-unknown factors that are affecting this, outside of explicit public health policy.  Weather, phases of the moon, wrath of the gods, who knows.

Or, my favorite, the likelihood of aerosol (airborne) transmission being the single largest remaining source of new infections outside of the residential setting.  Because if that’s the case, nobody has warned the population to take appropriate precautions against that.  (E.g., keep the windows open.)  We all remain equally ignorant and unprepared, thanks to the CDC’s unwillingness to admit to the likelihood of aerosol (airborne) transmission of this disease.