Diagnostics (Sep 2024)
Optimal Training Positive Sample Size Determination for Deep Learning with a Validation on CBCT Image Caries Recognition
Abstract
Objectives: When training a deep learning model, the sample size must be balanced against the available resources and time constraints. Single-arm objective performance criteria (OPC) were proposed to determine the optimal positive sample size for training deep learning models in caries recognition. Methods: An expected sensitivity (PT) of 0.6 and a clinically acceptable sensitivity (P0) of 0.5 were applied to the single-arm OPC calculation formula, yielding an optimal training set of 263 carious teeth. U-Net, YOLOv5n, and CariesDetectNet were trained and validated on self-collected clinical cone-beam computed tomography (CBCT) images containing varying numbers of carious teeth. An additional dataset was used to compare the caries detection accuracy of the models with that of two dental radiologists. Results: The models reached their optimal performance once the number of carious teeth reached approximately 250. U-Net performed best, achieving an accuracy, sensitivity, specificity, F1-score, and Dice similarity coefficient of 0.9929, 0.9307, 0.9989, 0.9590, and 0.9435, respectively. All three models recognized caries more accurately than the dental radiologists. Conclusions: This study demonstrated that the positive sample size of CBCT images containing caries was predictable and could be calculated using single-arm OPC.
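The sample size calculation in the Methods can be illustrated with a short sketch. The abstract does not state the formula, the significance level, or the power, so everything beyond PT = 0.6 and P0 = 0.5 is an assumption here: the sketch uses one commonly used normal-approximation form for a single-arm design, n = (z_{1-α/2} + z_{1-β})² · P0(1 − P0) / (PT − P0)², with an assumed two-sided α of 0.05 and an assumed power of 0.90. The function name `opc_sample_size` is hypothetical. Under these assumptions the result matches the 263 carious teeth reported above, but the authors' exact formula and parameters may differ.

```python
import math
from statistics import NormalDist


def opc_sample_size(p_t, p_0, alpha=0.05, power=0.90):
    """Approximate single-arm sample size for testing a proportion.

    p_t   : expected sensitivity (PT in the abstract)
    p_0   : clinically acceptable sensitivity (P0 in the abstract)
    alpha : two-sided significance level (assumed; not given in the abstract)
    power : 1 - beta (assumed; not given in the abstract)
    """
    if not (0 < p_0 < 1 and 0 < p_t < 1):
        raise ValueError("p_t and p_0 must lie strictly between 0 and 1")
    if p_t == p_0:
        raise ValueError("p_t must differ from p_0")

    # Standard normal quantiles for the chosen error rates.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)

    # Normal approximation using the variance under the target value P0.
    n = (z_alpha + z_beta) ** 2 * p_0 * (1 - p_0) / (p_t - p_0) ** 2

    # Round up: a fractional tooth cannot be collected.
    return math.ceil(n)


if __name__ == "__main__":
    print(opc_sample_size(0.6, 0.5))  # prints 263
```

The required sample size grows rapidly as PT approaches P0, because the difference enters the denominator squared; widening the gap to PT = 0.7 under the same assumptions reduces the requirement to 66.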
Keywords