Survey data yields improved estimates of test-confirmed COVID-19 cases when rapid at-home tests were massively distributed in the United States
Publication information:
Abstract
Importance: Identifying and tracking new infections during an emerging pandemic is crucial for
designing and deploying interventions that protect populations and mitigate the pandemic's
effects, yet it remains a challenging task.
Objective: To characterize the ability of non-probability online surveys to longitudinally
estimate the number of COVID-19 infections in the population both in the presence and
absence of institutionalized testing.
Design: Internet-based non-probability surveys were conducted through the PureSpectrum
survey vendor approximately every 6 weeks between April 2020 and January 2023. The surveys
collected information on COVID-19 infections, with representative state-level quotas applied to
balance age, gender, race and ethnicity, and geographic distribution. Survey data
were compared to institutional case counts collected by Johns Hopkins University and to
wastewater surveillance data for SARS-CoV-2 from Biobot Analytics.
Setting: Population-based online non-probability survey conducted for a multi-university
consortium, the Covid States Project.
Participants: Adults (aged 18 years or older) residing in the 50 US states and the District of
Columbia.
Main Outcomes and Measures: The main outcomes are: (a) survey-weighted estimates of
new monthly confirmed COVID-19 cases in the US from January 2020 to January 2023, and
(b) estimates of uncounted test-confirmed cases, from February 1, 2022, to January 1, 2023.
These are compared to institutionally reported COVID-19 infections and wastewater viral
concentrations.
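The two outcome measures above can be illustrated with a minimal sketch. It assumes a simple weighted-proportion estimator scaled to an adult population, and a gap computed as the survey estimate minus the official count. The abstract does not specify the study's actual weighting or estimation procedure, so the function names, weights, population size, and counts below are hypothetical toy values, not the authors' method or data.

```python
def weighted_proportion(reported_positive, weights):
    """Weighted share of respondents reporting a test-confirmed infection.

    reported_positive: list of booleans, one per respondent (hypothetical input).
    weights: list of non-negative survey weights, one per respondent.
    """
    if len(reported_positive) != len(weights):
        raise ValueError("responses and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    positive_weight = sum(w for y, w in zip(reported_positive, weights) if y)
    return positive_weight / total_weight


def estimated_cases(proportion, adult_population):
    """Scale a weighted proportion to a population-level case estimate."""
    return proportion * adult_population


def uncounted_cases(survey_estimate, official_count):
    """Cases implied by the survey but absent from official counts (floored at 0)."""
    return max(survey_estimate - official_count, 0.0)


# Toy illustration with made-up numbers (assumptions, not study data).
responses = [True, False, False, True, False]
weights = [1.2, 0.8, 1.0, 0.5, 1.5]
share = weighted_proportion(responses, weights)
cases = estimated_cases(share, adult_population=1000)
gap = uncounted_cases(cases, official_count=250)
```

In this sketch the gap is floored at zero so that months in which official counts exceed the survey estimate do not produce negative uncounted cases; whether the study applies such a floor is not stated in the abstract.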
Results: The survey spanned 17 waves deployed from June 2020 to January 2023, with a
total of 408,515 responses from 306,799 respondents (mean age 42.8 years, SD 13);
202,416 (66%) identified as women, and 104,383 (34%) as men. A total of 16,715 (5.4%)
identified as Asian, 33,234 (10.8%) as Black, 24,938 (8.1%) as Hispanic, 219,448 (71.5%) as
White, and 12,464 (4.1%) as another race. Overall, 64,946 respondents (15.9%) self-reported
a test-confirmed COVID-19 infection. National survey-weighted estimates of test-confirmed
COVID-19 cases were strongly correlated with institutionally reported COVID-19 infections
(Pearson correlation r=0.96; p=1.8e-12) from April 2020 to January 2022 (50-state correlation
average of r=0.88, SD = 0.073), the period before the government-led mass distribution of
at-home rapid tests. After January 2022, the correlation weakened and was no longer
statistically significant (r=0.55, p=0.08; 50-state correlation average of r=0.48, SD = 0.227). In
contrast, survey COVID-19 estimates correlated highly with SARS-CoV-2 viral concentrations
in wastewater both before (r=0.92; p=2.2e-09) and after (r=0.89; p=2.3e-04) January 2022.
Institutionally reported COVID-19 cases correlated (r=0.79, p=1.10e-05) with wastewater viral
concentrations before January 2022, but poorly (r=0.31, p=0.35) afterward, suggesting that both
survey and wastewater estimates may have captured test-confirmed COVID-19 infections after
January 2022 better than institutional counts did. Consistent correlation patterns were
observed at the state level.
Based on national-level survey estimates, approximately 54 million COVID-19 cases were
unaccounted for in official records between January 2022 and January 2023.
medRxiv preprint doi: https://doi.org/10.1101/2024.05.21.24307697; this version posted May 22, 2024. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.
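The before-and-after correlation comparison reported in the Results can be sketched as follows. This is a minimal illustration that assumes monthly series keyed by "YYYY-MM" strings and assigns the cutoff month itself to the "after" period; the abstract does not say which side of the split January 2022 falls on, and the helper names here are hypothetical. It computes Pearson's r only and does not reproduce the reported p-values.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def split_correlations(months, series_a, series_b, cutoff):
    """Return (r_before, r_after) for months < cutoff and months >= cutoff.

    months: list of "YYYY-MM" strings, which sort correctly as plain text.
    cutoff: "YYYY-MM" string; treating it as part of "after" is an assumption.
    """
    before = [(a, b) for m, a, b in zip(months, series_a, series_b) if m < cutoff]
    after = [(a, b) for m, a, b in zip(months, series_a, series_b) if m >= cutoff]
    r_before = pearson_r([a for a, _ in before], [b for _, b in before])
    r_after = pearson_r([a for a, _ in after], [b for _, b in after])
    return r_before, r_after


# Toy series (made-up values, not study data): the two series move together
# before the cutoff and diverge after it.
months = ["2021-10", "2021-11", "2021-12", "2022-01", "2022-02", "2022-03"]
survey = [1.0, 2.0, 3.0, 4.0, 6.0, 5.0]
official = [2.0, 4.0, 6.0, 3.0, 2.0, 3.0]
r_before, r_after = split_correlations(months, survey, official, cutoff="2022-01")
```

With real data, each period needs enough monthly points for the correlation to be meaningful; the three-point periods above are for illustration only.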
Conclusions and Relevance: Non-probability survey data can be used to estimate the temporal
evolution of test-confirmed
infections during an emerging disease outbreak. Self-reporting tools may enable government
and healthcare officials to implement accessible and affordable at-home testing for efficient
infection monitoring in the future.