IEEE Access (Jan 2024)
Data-Driven Network Connectivity Analysis: An Underestimated Metric
Abstract
In network structure analysis, we explore an underestimated key metric, the Relative Size of the Largest Connected Component (RSLCC), and demonstrate its importance in post-disaster network connectivity assessment. RSLCC was first investigated in the study of complex network structures but remains largely unexplored within specific application domains such as transportation, wireless, communication, and power networks. In this paper, we not only demonstrate that this metric is underestimated, but also design seven methods to predict its value, with the Deep Neural Network (DNN) reaching a prediction accuracy of more than 99%. The study focuses on the assessment and analysis of post-disaster network connectivity by exploring how RSLCC, a key connectivity metric, can be used to efficiently predict and assess network connectivity in a disaster scenario. Specifically, the approximate network connectivity value can be predicted simply from the number of connected edges in the pre-disaster network and the number of connected edges in the post-disaster network. To achieve this, first, a sufficiently large-scale collection of 100,000 datasets containing the values of attributes related to network structure is prepared. Second, after the data are preprocessed, principal component analysis and variance contribution analysis are carried out, and the metric that contributes most to the principal component is taken as an approximation of network connectivity. Third, for the prediction step, the Network Disruption Degree (NDD) is chosen as the independent variable, since it is preferable to predict from a single, extremely simple metric rather than from all network structure-related metrics. This paper demonstrates that satisfactory prediction results can be obtained with this metric alone.
It is found that the DNN prediction method has the highest prediction accuracy but takes the longest run time and requires a sufficiently large training set. When only a small amount of data is available, Random Forest Regression (RFR) is shown to have the highest prediction accuracy. Although the network connectivity metric proposed in this paper is only an approximation, it offers a useful direction for simplifying network connectivity analysis, and its use in practical modelling problems is highly interpretable.
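As a minimal illustration of the two quantities discussed above, the following sketch computes RSLCC as the number of nodes in the largest connected component divided by the total number of nodes, which is the usual reading of the metric's name. The abstract does not define NDD, so the helper `edge_loss_ratio` is a hypothetical stand-in that assumes the disruption degree is the fraction of pre-disaster edges lost; the paper's actual definition may differ. The function names and the toy six-node ring are illustrative only.

```python
from collections import deque


def rslcc(num_nodes, edges):
    """Return |largest connected component| / |nodes| for an undirected graph.

    Nodes are assumed to be labelled 0 .. num_nodes - 1.
    """
    if num_nodes == 0:
        return 0.0
    adjacency = {v: [] for v in range(num_nodes)}
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)

    seen = set()
    largest = 0
    for start in range(num_nodes):
        if start in seen:
            continue
        # Breadth-first search over one connected component.
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        largest = max(largest, size)
    return largest / num_nodes


def edge_loss_ratio(edges_before, edges_after):
    """ASSUMED stand-in for NDD: fraction of pre-disaster edges that were lost."""
    if edges_before == 0:
        return 0.0
    return 1 - edges_after / edges_before


# Toy example: a 6-node ring before the disaster, three surviving edges after.
pre_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
post_edges = [(0, 1), (1, 2), (3, 4)]

rslcc_pre = rslcc(6, pre_edges)
rslcc_post = rslcc(6, post_edges)
disruption = edge_loss_ratio(len(pre_edges), len(post_edges))
```

In this toy case the intact ring is a single component, while the damaged network splits into components of sizes 3, 2, and 1, so RSLCC falls from 1.0 to 0.5 at an assumed disruption of 0.5. A predictor of the kind described in the abstract would learn this mapping from disruption to RSLCC over many such samples.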
Keywords