Image source: Grainger Industrial
What I’m suggesting is that proposed vaccines that have passed Phase I (safety) testing ought to be tested, right now, in places where large outbreaks are common, while those outbreaks are happening. Places like prisons, meat-packing plants, nursing homes, hospitals, and other high-risk environments.
If nothing else, if these vaccines have some effectiveness, you’d be doing the workers and inmates in those situations a favor. But in addition, there’s good, solid arithmetic behind doing such acid tests: you’d know your answer to “does this vaccine work?” far faster than you would using standard Phase II drug testing, which relies on community members to volunteer and then be randomized into treatment and control groups. And that would be doing us all a favor.
The only hard part is how to do it in a way that actually generates a usable test of the vaccine. And even that part isn’t rocket science. It just has to skirt an ethics question and get some cooperation among state, federal, and private-sector entities.
How might this work?
You probably can’t do a standard two-arm research design here if you start by selecting facilities with a severe problem. That is, you can’t randomize individuals within each facility, give half the vaccine and half a placebo, and then compare the infection rates between the two halves. Presumably, people in these high-risk situations will be counting on getting some protection from the vaccine. In fact, they might only agree to participate because of that. I think it would be judged unethical to promise a vaccine and then deliver a placebo to half of them in this situation. Or it might simply be unworkable to walk into a high-risk situation and tell people they only have some chance of getting the real vaccine. Or maybe not.
Instead of randomizing the individuals within a facility into treatment and control (placebo) groups, you’d randomize facilities into treatment and no-treatment groups. This approach of taking large “chunks” of individuals at a time is known generically as “cluster sampling”. You take “clusters” of individuals with something in common (workers at the same factory, inmates at the same local jail), and build your statistical analysis up from there.
Here’s how I envision this working.
As soon as a state public health agency caught wind of an outbreak, the Feds would be notified. The Feds would flip a metaphorical coin, and either would or would not send in the US Public Health Service (US PHS) to offer vaccination to every individual in that facility who has not already contracted coronavirus.
This would be in addition to whatever the facility was already doing, or was mandated to do under state law. Typically, these employers end up testing every employee, so they’d do that, as is SOP by now. Then the facility would be offered vaccine for every employee not testing positive for the virus.
If they accept, they’d have to agree to re-testing their workforce in (say) 90 days. And for facilities not chosen to receive vaccine, you’d need to get some estimate of that same 90-day-later infection rate. You could plausibly do that, I think, by selecting a random sample of affected individuals, and asking them to be tested for coronavirus in 90 days. If nothing else, given how unbelievably expensive these vaccine trials are, you could offer them $100 each for getting re-tested.
Obviously, you’d keep this blinded. So the US PHS personnel administering the vaccine would not know which vaccine (of the top three candidates, say) they had administered. Neither would the individuals receiving the vaccine. Nor the individuals analyzing the resulting data.
Let’s say you got Idaho to cooperate with this. State public health officials would use their existing criteria to identify outbreaks, the same way that (say) Virginia uses its criteria. If they had ten outbreaks in jails, five would be notified of an offer, from the US PHS, to administer vaccine to all non-infected inmates and personnel in those jails. That would happen, and 90 days later you’d measure the infection rate in that population, and the infection rate in a randomly-chosen subset of the population in the five jails not vaccinated.
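To make that metaphorical coin flip concrete, here’s a minimal sketch of the facility-level randomization in Python. The ten jail names and the even five/five split are purely illustrative assumptions standing in for whatever the state’s outbreak criteria actually flag.

```python
import random

# Ten hypothetical jails flagged by the state's outbreak criteria.
jails = [f"jail_{i}" for i in range(1, 11)]

random.seed(1)          # fixed seed so the assignment is reproducible
random.shuffle(jails)

# Randomize whole facilities, not individuals: the first five get the
# US PHS vaccine offer; the other five serve as comparison sites.
offered, comparison = jails[:5], jails[5:]

print("Vaccine offer:", offered)
print("Comparison (re-test a random sample in 90 days):", comparison)
```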
In some sense, the results would be unique to the Idaho jail population. But I don’t think that would stop me from doing this, due to the major advantage laid out below: You’d need far fewer people, and would get an answer in far less time.
The key bit of arithmetic.
Here’s what got me thinking along these lines. The US-sponsored vaccine (Moderna) has a 30,000-person Phase II trial under way. It’s going to take a long time to get that many people enrolled. And, more to the point, because they are drawing them from the community, where infection rates are not that high, it’s going to take months of enrollment before they have enough cumulative exposure to say much of anything about their vaccine.
Interestingly, AstraZeneca’s Phase II trial has only about one-third as many participants as the Moderna trial: just over 10,000. I bet I can guess why that is: they are far more confident that their vaccine will have a large effect. The larger the expected effect, the smaller the sample size required to document that effect.
For the sake of argument, let’s say they manage to get 3000 people enrolled in the first month, split into 1500 treatment and 1500 control (placebo). If they do this trial in an average US state, how many cases of COVID-19 would they expect to see, in the first month, for that first cohort?
Right now, in Virginia, we are seeing roughly 1000 new COVID-19 cases per day. That works out to about 12 cases per 100,000 residents per day. So, if you had a placebo group of 1500 individuals, drawn from random Virginians, you’d expect to observe (1500 people x 30 days x 12 cases per 100,000 people per day) = 5 COVID-19 infections (rounded) in the first month.
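If you want to check that arithmetic, here it is in Python. The 12-per-100,000 daily rate and the 1500-person arm are the same assumptions as above.

```python
# Expected COVID-19 cases in a 1500-person placebo arm over 30 days,
# at the assumed community rate of 12 new cases per 100,000 people per day.
n_placebo = 1500
days = 30
daily_rate = 12 / 100_000   # cases per person per day

expected = n_placebo * days * daily_rate
print(f"Expected infections in 30 days: {expected:.1f}")   # ~5.4, i.e. about 5
```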
Let’s say the vaccine is 60% effective, as the best seasonal flu vaccine is. Is that enough cases to find a “statistically significant” impact? No, not even close, by my calculation. The difference between counts of 5 infected individuals (placebo) and maybe 2 (vaccine) is far too small to conclude anything.
Let’s add another 3000 participants in the second month. By the end of the second month, does that sample have enough statistical power to tell that the 60% effective vaccine has any impact whatsoever? Again, by my calculation, no. The “t-statistic” is just below the level at which you can say that the vaccine has any impact whatsoever. Let alone home in on how large that impact is.
It’s only after the third month, adding 3000 new participants per month, that you would be able to say that the vaccine has some (i.e., non-zero) impact, using the standard “95% confidence interval” approach. (That is, less than a 5% chance that you’d see a difference that large between the two groups purely by chance.)
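Here’s a back-of-the-envelope version of that month-by-month calculation, using a simple Wald z-statistic on the difference of the two Poisson counts. It’s a sketch, not a formal power analysis: a different choice of test statistic will land slightly above or below 2.0 at month two, which is exactly the borderline case described above.

```python
import math

daily_rate = 12 / 100_000   # assumed community infection rate (per person per day)
efficacy = 0.60             # assumed vaccine efficacy
per_arm = 1500              # new enrollees per arm, per month

for month in (1, 2, 3):
    # Staggered enrollment: the cohort enrolled in month k has been
    # exposed for (month - k + 1) months by the end of this month.
    person_days = sum(per_arm * 30 * (month - k + 1) for k in range(1, month + 1))
    placebo = person_days * daily_rate     # expected placebo-arm infections
    vaccine = placebo * (1 - efficacy)     # expected vaccine-arm infections
    # Wald z-statistic for the difference of two Poisson counts.
    z = (placebo - vaccine) / math.sqrt(placebo + vaccine)
    print(f"Month {month}: placebo ~ {placebo:.1f}, vaccine ~ {vaccine:.1f}, z ~ {z:.2f}")
```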
By contrast, let’s look at recruiting 3000 people from various facilities where the average infection rate is 20% per month. Now how many days do you have to wait to achieve a t-statistic over 2.0? With that high an infection rate, far less than a month. After one month, you’d have (an expected) 300 infections in the placebo group (1500 x 20%), 120 infections in the vaccinated group (1500 x 20% x 0.4, under the assumption the vaccine is 60% effective), and you’d have your answer. The difference in those counts is far more than enough to allow you to conclude that the vaccine works, with a t-statistic greater than 9.0.
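Same sketch for the facility scenario, this time as a two-proportion z-test. The 20% monthly attack rate and 60% efficacy are the same assumptions as above.

```python
import math

n = 1500             # per arm
attack_rate = 0.20   # assumed monthly infection rate in outbreak facilities
efficacy = 0.60      # assumed vaccine efficacy

p1 = attack_rate                     # placebo-arm infection rate (300 cases expected)
p2 = attack_rate * (1 - efficacy)    # vaccine-arm infection rate (120 cases expected)

# Two-proportion z-test on the difference in infection rates.
se = math.sqrt(p1 * (1 - p1) / n + p2 * (1 - p2) / n)
z = (p1 - p2) / se
print(f"z ~ {z:.1f}")   # ~9.6, comfortably past any conventional cutoff
```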
There are some nuances here. This facility-at-a-time approach is a “clustered design”: you haven’t randomly chosen individuals, you’ve randomly chosen groups of individuals (e.g., an entire factory or jail at a time). Factors common to the people in a cluster can cause a considerable loss of statistical power. But as long as you pick a large number of relatively small clusters (or, equivalently, randomly sample a few individuals to be analyzed within each cluster, for a large number of clusters), that loss of statistical power can be kept fairly modest. Under the latter approach, you’d end up vaccinating far more than 1500 individuals, but you’d resample the data (or, equivalently, reweight it) so that the number of clusters (separate sites) is large relative to the total number of individuals in the sample, as the design-effect sketch below illustrates.
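The textbook way to quantify that power loss is the design effect, DEFF = 1 + (m - 1) * rho, where m is the number of individuals analyzed per cluster and rho is the intra-cluster correlation. The rho of 0.05 below is purely illustrative; the point is just that many small clusters beat a few big ones.

```python
# Design effect for a clustered sample: DEFF = 1 + (m - 1) * rho,
# where m = individuals analyzed per cluster, rho = intra-cluster correlation.
def design_effect(m: int, rho: float) -> float:
    return 1 + (m - 1) * rho

rho = 0.05   # assumed intra-cluster correlation (illustrative only)

# Same 1500 people per arm, carved up two different ways:
for m, sites in [(50, 30), (10, 150)]:
    deff = design_effect(m, rho)
    effective_n = m * sites / deff   # what the clustered sample is "worth"
    print(f"{sites} sites x {m} people: DEFF = {deff:.2f}, "
          f"effective n ~ {effective_n:.0f} of {m * sites}")
```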
But, surprise, surprise, the mainstream clinical trials of the vaccines are also clustered sample designs. Just with larger clusters. And using within-cluster randomization to remove most of the common effects of the cluster. But still, starting from clusters (cities) chosen for their high current infection rates.
You have undoubtedly read that these vaccine researchers are chasing after COVID-19 “hot spots”. An out-of-control epidemic, as in South Florida or Savannah, Georgia, is exactly what they are looking for as a place to test their vaccines. And this arithmetic tells you why: the higher the underlying infection rate, the fewer person-months of exposure you need in order to be able to say something about your vaccine.
All I’m doing here is taking that to the obvious and logical conclusion. Those folks are doing clustered samples, but their “cluster” is a city. I.e., they aren’t randomly selecting individuals across the USA, but instead, are deliberately targeting cities with high infection rates. Then setting up treatment and control (placebo) groups within those.
So, in the end, this post is just a suggestion that they do it more efficiently by using high-risk facilities with outbreaks as the clusters, instead of high-risk cities with outbreaks. You’d get your answer a lot sooner, with a lot fewer people. And you’d be doing the workers, inmates, and residents of those facilities a favor.