Scientific result | Mathematics | Simulation & modelling | Virus

# A statistical and dynamic approach to better understand the limits of COVID-19 epidemiological models

*Published on 21 January 2021*

- Asymptotic estimates of SARS-CoV-2 infection counts and their sensitivity to stochastic perturbation, *Chaos*
- Modeling the second wave of COVID-19 infections in France and Italy via a stochastic SEIR model, *Chaos*

As the COVID-19 epidemic began to spread throughout the world, scientists from several countries (France, the United Kingdom, Mexico, Denmark and Japan) – including Davide Faranda from the LSCE – expressed concern about the diversity of approaches adopted by epidemiologists.

The description of an epidemic naturally requires a physical model of its evolution, but also a set of health data for initialization, such as the data provided by the *Center for Systems Science and Engineering* at the Johns Hopkins University in the United States (daily number of new infections, *etc.*).

By using COVID-19 data from different countries, the researchers show that the various predictions are extremely sensitive to the reporting protocol and crucially depend on the last available data point before the maximum number of daily infections is reached.

They propose a physical explanation for this sensitivity, using a *Susceptible-Exposed-Infected-Recovered* model that divides the population into four groups: those who are susceptible to catching the virus; those who have contracted it but show no symptoms; those who are infected; and finally, those who have recovered or died from the virus. In order to determine how people move from one group to another, fundamental parameters like the infection rate, incubation time, and recovery time must be known.
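In a simplified form, the SEIR equations can be integrated with a basic forward-Euler scheme. The sketch below is illustrative only: the parameter values (infection rate `beta`, incubation rate `sigma`, recovery rate `gamma`) are placeholders, not the values fitted in the published studies.

```python
def seir_step(s, e, i, r, beta, sigma, gamma, dt):
    """One forward-Euler step of the SEIR equations (population fractions)."""
    s_to_e = beta * s * i * dt   # susceptibles exposed through contact with infected
    e_to_i = sigma * e * dt      # exposed become infectious (sigma = 1 / incubation time)
    i_to_r = gamma * i * dt      # infected recover or die (gamma = 1 / recovery time)
    return (s - s_to_e,
            e + s_to_e - e_to_i,
            i + e_to_i - i_to_r,
            r + i_to_r)

def run_seir(beta=0.3, sigma=0.2, gamma=0.1, i0=1e-4, days=365, dt=0.1):
    """Integrate the model and return the final compartments plus peak prevalence."""
    s, e, i, r = 1.0 - i0, 0.0, i0, 0.0
    peak = i
    for _ in range(int(days / dt)):
        s, e, i, r = seir_step(s, e, i, r, beta, sigma, gamma, dt)
        peak = max(peak, i)
    return s, e, i, r, peak
```

With these illustrative values the basic reproduction number is beta/gamma = 3, so the model epidemic infects most of the population within the year.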

The scientists randomly perturb these parameters to simulate the variability in the detection of patients, in the containment measures taken by different countries, and in the evolution of virus characteristics or the presence of super-spreaders. Their results, which are not specific to COVID-19, suggest that there are physical and statistical reasons to assign low confidence to purely statistical predictions, despite their apparently good scores.
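A minimal way to mimic this kind of sensitivity analysis – an illustration only, not the stochastic SEIR model of the published papers – is to rerun the SEIR integration across an ensemble in which each parameter is drawn with ±20% uncertainty, and compare the total fraction of the population ever infected:

```python
import random

def final_size(beta, sigma, gamma, i0=1e-4, days=500, dt=0.1):
    """Fraction of the population ever infected, from a forward-Euler SEIR run."""
    s, e, i, r = 1.0 - i0, 0.0, i0, 0.0
    for _ in range(int(days / dt)):
        s_to_e, e_to_i, i_to_r = beta * s * i * dt, sigma * e * dt, gamma * i * dt
        s, e, i, r = s - s_to_e, e + s_to_e - e_to_i, i + e_to_i - i_to_r, r + i_to_r
    return r

rng = random.Random(42)
finals = []
for _ in range(100):
    beta = 0.30 * (1 + rng.uniform(-0.2, 0.2))    # infection rate, +/-20% uncertain
    sigma = 0.20 * (1 + rng.uniform(-0.2, 0.2))   # 1 / incubation time
    gamma = 0.10 * (1 + rng.uniform(-0.2, 0.2))   # 1 / recovery time
    finals.append(final_size(beta, sigma, gamma))

spread = max(finals) - min(finals)  # outcomes diverge across the ensemble
```

Even this crude ensemble shows a wide range of final epidemic sizes from modest parameter uncertainty, consistent with the researchers' point about divergent predictions.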

*"There is a great amount of uncertainty in the fundamental parameters. In particular, the incompleteness of infection data means that the models can produce incredibly divergent results,"* notes Davide Faranda, LSCE researcher. *"For example, underestimating the number of infected people by 20% can change the final estimates by a few thousand to a few million sick individuals."*

In particular, the researchers have shown that near real-time predictions of COVID-19 infections vary greatly depending on the last available data point. The dynamics of the epidemic are therefore extremely sensitive to the model's parameters during the initial growth phase. This means that the predictions made at the beginning of an epidemic wave lack the expected robustness to define the most appropriate protective measures.
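The role of the last data point can be shown with a toy experiment (again, not the analysis of the papers themselves): generate a synthetic epidemic curve, fit a pure exponential to the log of the daily counts, then inflate only the most recent count by 20% before extrapolating.

```python
import math

# Synthetic epidemic: logistic cumulative infections (illustrative parameters)
def cumulative(t, K=1e6, r=0.15, t0=60):
    return K / (1 + math.exp(-r * (t - t0)))

daily = [cumulative(t + 1) - cumulative(t) for t in range(35)]  # growth phase, days 0..34

def exp_forecast(series, target_day):
    """Least-squares fit of log(daily counts) vs day, extrapolated to target_day."""
    n = len(series)
    xs = range(n)
    ys = [math.log(v) for v in series]
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    slope = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    return math.exp(ybar + slope * (target_day - xbar))

base = exp_forecast(daily, 70)

# Same series, but the most recent reported count is 20% too high
perturbed = daily[:-1] + [daily[-1] * 1.2]
shifted = exp_forecast(perturbed, 70)
```

A single misreported day at the end of the series is enough to shift the extrapolated count, because the last point carries the most leverage in the fit.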

Moreover, uncertainties resulting from a combination of poor data quality and inadequate estimates of the parameters (rates of incubation, infection and recovery) propagate and amplify in longer-term extrapolations.

The approach taken by the researchers allowed them to construct several scenarios describing the propagation of COVID-19 in France and Italy in which, despite great uncertainties, the threat of a second wave was clearly detectable as early as the spring of 2020.

The following **health data from the Johns Hopkins University** were used by the researchers:

https://systems.jhu.edu/research/public-health/ncov/

https://github.com/CSSEGISandData/COVID-19

In addition, the article **"Modeling COVID-19 data must be done with extreme care, scientists say"** can be read on the website Phys.org.