Universe (Sep 2023)
Random Forest Classification and Ionospheric Response to Solar Flares: Analysis and Validation
Abstract
Manually checking, validating, and excluding data in an ionospheric very-low-frequency (VLF) analysis during extreme events is labor-intensive and time-consuming. This task can, however, be automated with machine learning (ML) classification techniques. This study employed the Random Forest (RF) classification algorithm to automatically classify the impact of solar flares on ionospheric VLF data, as well as erroneous data points such as instrumentation errors and noisy data. The data analyzed were collected during September and October 2011 and encompass solar flare classes ranging from C2.5 to X2.1. The F1-score obtained on the test dataset was 0.848; a more detailed analysis revealed that, owing to the imbalanced distribution of the target class, the per-class F1-score was higher for the normal data point class (0.69–0.97) than for the anomalous data point class (0.31–0.71). Instances of successful and inadequate classification were analyzed and presented visually. This research investigated the potential application of ML techniques to the automated identification and classification of erroneous VLF amplitude data points; the findings also hold promise for the detection of short-term ionospheric responses to, e.g., gamma-ray bursts (GRBs), or for the analysis of pre-earthquake ionospheric anomalies.
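The gap between an aggregate F1-score and the per-class values under class imbalance can be illustrated with a minimal sketch. The labels below are invented toy data (0 = normal, 1 = anomalous), not the VLF dataset, and the support-weighted average is an assumption, since the averaging scheme behind the reported 0.848 is not stated here.

```python
from collections import Counter


def per_class_f1(y_true, y_pred, label):
    """F1-score for a single class, computed from TP, FP and FN counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def weighted_f1(y_true, y_pred):
    """Average of per-class F1-scores, weighted by class support."""
    support = Counter(y_true)
    total = len(y_true)
    return sum(
        per_class_f1(y_true, y_pred, label) * count / total
        for label, count in support.items()
    )


# Hypothetical imbalanced toy labels: 90 normal points, 10 anomalous points.
y_true = [0] * 90 + [1] * 10
# 85 normal points classified correctly, 5 flagged as anomalous by mistake;
# 4 anomalous points detected, 6 missed.
y_pred = [0] * 85 + [1] * 5 + [1] * 4 + [0] * 6

f1_normal = per_class_f1(y_true, y_pred, 0)
f1_anomalous = per_class_f1(y_true, y_pred, 1)
f1_weighted = weighted_f1(y_true, y_pred)

print(f"normal:    {f1_normal:.3f}")
print(f"anomalous: {f1_anomalous:.3f}")
print(f"weighted:  {f1_weighted:.3f}")
```

In this toy case the majority class dominates the weighted average, so the aggregate score stays high even though the anomalous class is classified poorly, which is the pattern the per-class values above describe.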
Keywords