Here's What's Wrong With COVID-19 Case Counts

— How base rate fallacy may be clouding the picture of disease activity in an Indiana college town

by Robert Hagen, MD, November 7, 2020

Last Updated November 9, 2020

There's certainly no denying the severity of COVID-19 in the U.S., but the numbers of positive tests reported can lead to confusion – especially for those of us in university towns.

Most of us in healthcare have a fairly good understanding of math but are less well versed in statistics. Unfortunately, a poor grasp of one statistical principle, the base rate fallacy (also called the false positive paradox), has led to some confusing numbers.

A classic 1978 article in the New England Journal of Medicine reveals this problem. The researchers asked 60 Harvard physicians and medical students a seemingly simple question: If a test to detect a disease with a prevalence of 1/1,000 has a false positive rate of 5%, what is the chance that a person found to have a positive result actually has the disease?

Only 14% gave the correct answer of 2% with most answering 95%.
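
The 2% figure follows from Bayes' theorem. As a sketch, assume the test catches every true case (a sensitivity of 100%, which the question does not state), and write D for having the disease and + for a positive result:

```latex
\[
P(D \mid +)
  = \frac{P(+ \mid D)\,P(D)}{P(+ \mid D)\,P(D) + P(+ \mid \neg D)\,P(\neg D)}
  = \frac{1 \times 0.001}{1 \times 0.001 + 0.05 \times 0.999}
  = \frac{0.001}{0.05095}
  \approx 0.0196
\]
```

Put another way, of roughly 51 positive results per 1,000 people tested, only about 1 belongs to someone who has the disease.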

The base rate fallacy, or false positive paradox, follows from Bayes' theorem. When the incidence of a disease in a population is low, a test will produce more false positives than true positives unless its specificity is very high. The difference in the numbers can be striking, and it is far from intuitive.

We have learned in the past from routine PSA testing and mammograms that a positive test in a screening situation needs to be taken in context. The incidence of a disease in the population that you are testing is extremely important for accuracy.

Purdue University made the decision in late spring to resume in-person classes for its fall session. Purdue is a major research university with a strong emphasis on STEM education. Many of these classes include practicums, laboratory sessions, and group projects that require some in-person attendance.

An elaborate plan was implemented, including a signed pledge from all students to behave properly, wear masks, and maintain social distancing. The university decided to randomly test 10% of students and staff each week. Since students and staff together number 50,000 at Purdue University, 5,000 tests are done every week. The purpose of the random testing was surveillance, to encourage students and staff to maintain proper behavior.

The Indiana State Department of Health advised against a random testing program because it felt overall data accuracy would be difficult to maintain. In our county, data from people tested WITH symptoms have been commingled with data from randomly tested Purdue students WITHOUT symptoms. Doing so ignores the base rate fallacy/false positive paradox.

Up to this point, Purdue has done random testing on about 1,000 students per weekday. Of those, about 35 are positive each day, according to the university's dashboard. Students who test positive have to isolate in an old dormitory or go home. Those who choose to go home will often have another test by their personal physician. When these tests return negative, significant confusion occurs.

So far, 90% of the students who test positive do not develop symptoms. Only one has been hospitalized and none have died. Had Purdue chosen to test all 50,000 students and staff every week, 10 times as many would have been reported as testing positive each week. Had those data been commingled with testing of symptomatic individuals, the casual observer would certainly have raised an outcry to close everything down again. Yet those numbers would represent only the positivity rate of mass testing, not the prevalence of infectious patients.
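
The tenfold figure is simple arithmetic. A minimal sketch, assuming the dashboard's roughly 35 positives per 1,000 daily tests held steady across five weekdays and would also hold if everyone were tested (the variable names are illustrative, not Purdue's):

```python
# Figures quoted in the article; the constant positivity rate is an assumption.
tests_per_weekday = 1_000
positives_per_weekday = 35
weekdays_per_week = 5
campus_population = 50_000

positivity = positives_per_weekday / tests_per_weekday
weekly_tests = tests_per_weekday * weekdays_per_week
weekly_positives_sampled = positives_per_weekday * weekdays_per_week

# Testing everyone weekly instead of a 10% sample, at the same positivity rate.
weekly_positives_full = campus_population * positivity

print(f"Tests per week, 10% sample: {weekly_tests:,}")
print(f"Positivity rate: {positivity:.1%}")
print(f"Weekly positives, 10% sample: {weekly_positives_sampled}")
print(f"Weekly positives, full campus: {weekly_positives_full:.0f}")
```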

Those 35 students who test positive daily are added to our county totals, many of which come from tests done on people with COVID-19 symptoms. As a result, our county's number of positive tests appears to have doubled since Purdue started in-person classes in August.

The numbers have caused our county health department to move cautiously. Restaurant occupancy, sporting events, and other large gatherings are again restricted beyond what the state requires.

Without knowing the specificity of the test, the number of these positives that are false positives is unknown.

By base rate fallacy/false positive paradox, if a test with 95% specificity is used in a population with a 2% incidence of disease, such as healthy college students and staff, there will be about 5 false positives for every 2 true positives. (The actual incidence of active COVID-19 in college-age students is not known but is estimated to be less than 0.6% by Indiana University/Fairbanks data.) Even a test with 99% specificity, used in a population with a 1% incidence, generates about 10 false positives for every 9 true positives.
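
Those ratios can be checked with a short calculation. The sketch below is illustrative only: the function name and the sensitivity values are assumptions, since the article gives specificity and incidence but no sensitivity.

```python
def false_per_true_positive(prevalence, specificity, sensitivity):
    """Expected false positives per true positive in a tested population.

    All three arguments are fractions between 0 and 1.
    """
    true_positive_rate = prevalence * sensitivity
    false_positive_rate = (1 - prevalence) * (1 - specificity)
    return false_positive_rate / true_positive_rate


# Scenario 1: 95% specificity, 2% incidence, sensitivity assumed perfect.
ratio_a = false_per_true_positive(prevalence=0.02, specificity=0.95, sensitivity=1.0)

# Scenario 2: 99% specificity, 1% incidence, sensitivity assumed to be 90%.
ratio_b = false_per_true_positive(prevalence=0.01, specificity=0.99, sensitivity=0.9)

print(f"Scenario 1: {ratio_a:.2f} false positives per true positive")
print(f"Scenario 2: {ratio_b:.2f} false positives per true positive")
```

Under these assumptions the first scenario gives about 2.45 false positives per true positive, close to the 5-to-2 figure, and the second gives about 1.1, close to 10-to-9. A less sensitive test would push both ratios higher.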

When the same test is used on patients with COVID-19 symptoms, whose incidence of disease is 50% or greater, the test does not have to be perfect. Even with a specificity of only 90%, false positives will make up a much smaller share of the positive results.
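
A rough check of that contrast, assuming a 50% incidence among symptomatic patients, 90% specificity, and a hypothetical 90% sensitivity (the article does not give one):

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Share of positive results that are true positives (Bayes' theorem)."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives / (true_positives + false_positives)


# Symptomatic patients: incidence from the article, sensitivity assumed.
symptomatic = positive_predictive_value(prevalence=0.50, sensitivity=0.90, specificity=0.90)

# Same hypothetical test applied to a random sample with 1% incidence.
random_sample = positive_predictive_value(prevalence=0.01, sensitivity=0.90, specificity=0.90)

print(f"Symptomatic patients: {symptomatic:.0%} of positives are true")
print(f"Random sample:        {random_sample:.0%} of positives are true")
```

With these inputs, about 90% of positives among symptomatic patients are real, against roughly 8% in the random sample.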

The actual sensitivity and specificity of COVID-19 tests are unknown as these tests were okayed by the FDA under Emergency Use Authorization. Manufacturers' data have not yet been corroborated by the agency.

The tests are "good enough" for diagnosing patients with symptoms but not nearly as effective when used for a random testing program.

By not reporting these groups separately, we really have no idea what's going on in our town. Luckily, Purdue keeps its own dashboard, and with some calculations its data can be separated from the county data to give us a ballpark guess. In addition, because more testing is available, Indiana now performs as many as 40,000 COVID tests on some days, up from 20,000 per day eight weeks ago. Our state has a population of 6.5 million. At that rate, about 4% of Indiana's population is being tested for COVID-19 every week.
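
The 4% figure is straightforward arithmetic. A sketch using the numbers above, with the simplifying assumption that each test corresponds to a different person (repeat testing would lower the true share):

```python
# Figures from the article; one test per person per week is an assumption.
tests_per_day = 40_000
state_population = 6_500_000

weekly_tests = tests_per_day * 7
weekly_share = weekly_tests / state_population

print(f"Tests per week: {weekly_tests:,}")
print(f"Share of population tested weekly: {weekly_share:.1%}")
```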

Purdue has discussed using a serial testing protocol. Antigen tests will be used on the random population with subsequent confirmatory PCR tests used for anyone who initially tests positive. This should decrease the number of overall false positives and hopefully will prevent so many from being quarantined.
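
To see why a confirmatory step helps, consider a sketch in which a result counts as positive only if both tests are positive. Every number below is a hypothetical placeholder, not a measured value for any real antigen or PCR test, and the calculation assumes the two tests err independently of each other, which may not hold in practice.

```python
def serial_test_performance(sens_first, spec_first, sens_second, spec_second):
    """Combined sensitivity and specificity when a case is called positive
    only if the first test AND the confirmatory test are both positive.

    Assumes the two tests' errors are independent given true disease status.
    """
    combined_sensitivity = sens_first * sens_second
    combined_false_positive_rate = (1 - spec_first) * (1 - spec_second)
    return combined_sensitivity, 1 - combined_false_positive_rate


def positive_predictive_value(prevalence, sensitivity, specificity):
    """Share of positive results that are true positives."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives / (true_positives + false_positives)


prevalence = 0.01  # hypothetical incidence in the randomly tested group

# Hypothetical antigen screen followed by a hypothetical PCR confirmation.
sens, spec = serial_test_performance(0.85, 0.97, 0.95, 0.99)

single = positive_predictive_value(prevalence, 0.85, 0.97)
serial = positive_predictive_value(prevalence, sens, spec)

print(f"Antigen test alone: {single:.0%} of positives are true")
print(f"Antigen then PCR:   {serial:.0%} of positives are true")
```

With these placeholder values, the share of positives that are real rises from about 22% to about 96%. The price is a somewhat lower combined sensitivity, since a true case must now be caught twice.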

Certainly positivity rates are going up here. Contact tracers are telling positive testers who have nowhere to isolate to be evaluated at their hospital emergency room. Could this be the reason for increased hospitalizations? As of a week ago, our two local hospitals with a combined 350 beds had 18 patients admitted with a COVID diagnosis. COVID deaths in Indiana average about 23 per day, but that too is going up.

So it's all very confusing. Ideally, testing those WITH symptoms would be reported separately from those randomly being tested WITHOUT symptoms.

People identified through contact tracing as having been close to a COVID patient WITH symptoms (>10% incidence of testing positive for COVID) would be a third category, and those identified as having been near a person who tested positive WITHOUT symptoms (>1% incidence of having COVID) would be a fourth.

Throw all four of those groups in together if you want, but understand that you are not getting a true picture of what is going on. We must compare apples to apples and oranges to oranges rather than just making fruit salad out of the whole thing. Bad decisions can be made because of a misunderstanding of statistics.

Robert Hagen, MD, is recently retired from Lafayette Orthopaedic Clinic in Indiana. He's an adjunct professor at Indiana University, a past president and board member of the Indiana Orthopaedic Society, and a past member of the Board of Councilors for the American Academy of Orthopaedic Surgeons.