IEEE Access (Jan 2024)

Multidimensional Prediction Method for Thyroid Cancer Based on Spatiotemporally Imbalanced Distribution Data

  • Zhiwei Jia,
  • Yuqi Huang,
  • Yanhui Lin,
  • Min Fu,
  • Chenhao Sun

DOI
https://doi.org/10.1109/ACCESS.2023.3347635
Journal volume & issue
Vol. 12
pp. 4674 – 4686

Abstract

In complex data environments, rational handling of imbalanced datasets is key to improving the reliability of early disease prediction. Early warning of disease risk in both temporal and spatial terms contributes to disease prevention and treatment. To this end, a bi-dimensional substratum information mining model based on Association Rule Digging with Dynamic Thresholding and Weight Optimization (ARDdtwo) was proposed for the early diagnosis of thyroid cancer. It is an integrated assessment framework with two parts: association rule digging by constructing a dynamic threshold model (ARDcdt) for qualitative analysis, and a self-optimizing component importance measurement model (SoCIM) for quantitative analysis. ARDcdt incorporates temporal and spatial features of sparse data to address the distributional bias problem. Moreover, new importance diagnostic calculations were designed to further identify high-risk low-frequency (HRLF) factors. SoCIM determines the relative weight of each component by assessing its level of risk in the overall system based on the Risk Enhancement Level (REL) and Risk Reduction Level (RRL), so that the weight setting is adjusted and optimized automatically. Finally, the model was validated through an empirical analysis. The evaluation shows improved accuracy, F1-score, and precision, with optimized values of 36.04%, 56.57%, and 53.89%, respectively. The overall area under the curve for the model was 0.882. These results demonstrate the validity of the proposed model for practical applications. For patients, it can simplify the pathological process and reduce examination costs.
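The abstract does not give the form of the dynamic threshold, so the following Python sketch only illustrates the general idea of why a per-item threshold helps with high-risk low-frequency items. It assumes a multiple-minimum-support rule (threshold = max(beta × item frequency, floor)), which is a swapped-in, well-known scheme and not the ARDcdt formulation. The function names, the parameters `beta`, `floor`, and `min_conf`, and the toy records are all hypothetical.

```python
from collections import Counter


def item_min_support(freq, beta=0.5, floor=0.05):
    # Assumed dynamic threshold: scales with the item's own frequency,
    # but never drops below `floor`. Not taken from the paper.
    return max(beta * freq, floor)


def mine_rules(records, target, beta=0.5, floor=0.05, min_conf=0.6):
    """Return single-item rules {item} -> target as (item, support, confidence)."""
    n = len(records)
    counts = Counter(item for rec in records for item in set(rec))
    rules = []
    for item, cnt in counts.items():
        if item == target:
            continue
        both = sum(1 for rec in records if item in rec and target in rec)
        support = both / n
        confidence = both / cnt
        # The rule's threshold is the lowest threshold among its items,
        # so a rare antecedent is not held to a common item's standard.
        threshold = min(
            item_min_support(cnt / n, beta, floor),
            item_min_support(counts[target] / n, beta, floor),
        )
        if support >= threshold and confidence >= min_conf:
            rules.append((item, round(support, 3), round(confidence, 3)))
    return sorted(rules, key=lambda r: -r[2])


# Toy, made-up records: a rare factor that always co-occurs with the outcome,
# and a common factor that usually does not.
records = (
    [{"rare_factor", "cancer"}] * 2
    + [{"common_factor", "cancer"}] * 3
    + [{"common_factor"}] * 7
    + [{"other"}] * 8
)

rules = mine_rules(records, target="cancer")
print(rules)  # [('rare_factor', 0.1, 1.0)]
```

In this toy data the rare factor has a rule support of only 0.1, so a fixed global minimum support of, say, 0.125 would discard it. The frequency-scaled threshold keeps it, while the common factor is rejected because its confidence is low.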

Keywords