Comparison of classical, xgboost and neural network methods for parameter estimation in epidemic processes on random graphs
Ágnes Backhausz,
Edit Bognár,
Villő Csiszár,
Damján Tárkányi,
András Zempléni
Affiliations
Ágnes Backhausz
ELTE Eötvös Loránd University, Budapest, Hungary, Faculty of Science, Department of Probability Theory and Statistics, Pázmány Péter sétány 1/c, Budapest, H-1117, Hungary; HUN-REN Alfréd Rényi Institute of Mathematics, Reáltanoda utca 13-15., Budapest, H-1053, Hungary; Corresponding author at: ELTE Eötvös Loránd University, Budapest, Hungary, Faculty of Science, Department of Probability Theory and Statistics, Pázmány Péter sétány 1/c, Budapest, H-1117, Hungary.
Edit Bognár
ELTE Eötvös Loránd University, Budapest, Hungary, Faculty of Science, Department of Probability Theory and Statistics, Pázmány Péter sétány 1/c, Budapest, H-1117, Hungary
Villő Csiszár
ELTE Eötvös Loránd University, Budapest, Hungary, Faculty of Science, Department of Probability Theory and Statistics, Pázmány Péter sétány 1/c, Budapest, H-1117, Hungary
Damján Tárkányi
ELTE Eötvös Loránd University, Budapest, Hungary, Faculty of Science, Department of Probability Theory and Statistics, Pázmány Péter sétány 1/c, Budapest, H-1117, Hungary
András Zempléni
ELTE Eötvös Loránd University, Budapest, Hungary, Faculty of Science, Department of Probability Theory and Statistics, Pázmány Péter sétány 1/c, Budapest, H-1117, Hungary
The main goal of this paper is to quantitatively compare the performance of classical methods with that of XGBoost and convolutional neural networks in a parameter estimation problem for SIR epidemic spread. Since we model the underlying social network by flexible two-layer random graphs, we can also study how structural differences between the graphs in the training set and those in the test set influence the error of the estimate. We also quantify how much the results improve when additional information (such as the average degree of infected vertices) is available, compared to the case when only the time series of the numbers of susceptible and infected individuals is observed. Furthermore, the simulation results show how the accuracy of the methods varies with the time elapsed since the start of the epidemic.
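
To illustrate the kind of observation the estimation problem starts from, the following minimal sketch simulates an SIR epidemic on a random graph and records the time series of susceptible, infected and recovered counts. It is not the simulation used in the paper: the Erdős–Rényi graph (in place of the two-layer random graphs), the discrete-time dynamics, and the names `erdos_renyi`, `simulate_sir`, `beta` (per-contact infection probability) and `gamma` (recovery probability) are all assumptions made for illustration.

```python
import random


def erdos_renyi(n, p, rng):
    """Adjacency lists of an Erdős–Rényi graph G(n, p) (illustrative stand-in graph)."""
    adj = [[] for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
    return adj


def simulate_sir(adj, beta, gamma, n_seeds, steps, rng):
    """Discrete-time SIR on a graph; returns a list of (S, I, R) counts per step.

    Assumed dynamics: in each step every infected vertex infects each
    susceptible neighbour independently with probability beta, and then
    recovers with probability gamma.
    """
    n = len(adj)
    state = ["S"] * n
    for v in rng.sample(range(n), n_seeds):
        state[v] = "I"
    history = [(state.count("S"), state.count("I"), state.count("R"))]
    for _ in range(steps):
        new_state = list(state)  # synchronous update based on the old state
        for v in range(n):
            if state[v] != "I":
                continue
            for w in adj[v]:
                if state[w] == "S" and rng.random() < beta:
                    new_state[w] = "I"
            if rng.random() < gamma:
                new_state[v] = "R"
        state = new_state
        history.append((state.count("S"), state.count("I"), state.count("R")))
    return history


rng = random.Random(0)  # fixed seed so the run is reproducible
adj = erdos_renyi(200, 0.05, rng)
history = simulate_sir(adj, beta=0.1, gamma=0.2, n_seeds=5, steps=30, rng=rng)
```

Under these assumptions, `history` is the observed time series, and an estimator (classical, XGBoost or neural network) would map it, possibly truncated to the first few steps, to an estimate of the infection parameter.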