Heliyon (May 2024)
A novel approach for identification of zoonotic trypanosome utilizing deep metric learning and vector database-based image retrieval system
Abstract
Trypanosomiasis, a significant health concern in South America, South Asia, and Southeast Asia, requires active surveys for effective disease control. To address this, we developed a hybrid model that combines deep metric learning (DML) and image retrieval. The model identifies Trypanosoma species in microscopic images from thin blood film examinations. Using a ResNet50 backbone neural network, the trained model demonstrated outstanding performance, achieving an accuracy exceeding 99.71 % and a recall of up to 96 %. Acknowledging the need for automated tools in field scenarios, we demonstrated the potential of our model as an autonomous screening approach. This was achieved by combining prevailing convolutional neural network (CNN) applications with a vector database from which matching images are returned by the k-nearest neighbors (KNN) algorithm. This achievement, with a precision of 98 %, is primarily attributed to the implementation of the Triplet Margin Loss function. The robustness of the model in five-fold cross-validation, with an AUC >98 %, highlights the DML-based ResNet50 neural network as a state-of-the-art CNN model. The adoption of DML significantly improves the performance of the model, which remains unaffected by variations in the dataset, making it a useful tool for fieldwork studies. DML offers several advantages over conventional classification models. It can manage large-scale datasets with a high number of classes, which enhances scalability. It can generalize to novel classes that were not encountered during training, which is particularly advantageous in scenarios where new classes may continually emerge. It is also well suited for applications requiring precise recognition, especially discrimination between closely related classes. Furthermore, DML is more resilient to class imbalance, because it concentrates on learning distances or similarities, which are more tolerant of such imbalances. 
These contributions significantly enhance the effectiveness and practicality of the DML model, particularly in fieldwork research.
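The two ingredients named in the abstract, a triplet margin loss for training and KNN retrieval over stored embeddings for identification, can be illustrated with a minimal sketch. This is not the authors' implementation: it uses plain NumPy in place of a ResNet50 backbone and a vector database, and the function names, the 2-D toy embeddings, and the labels `species_A` and `species_B` are all hypothetical. The loss follows the standard formulation max(d(a, p) - d(a, n) + margin, 0) with Euclidean distance.

```python
import numpy as np
from collections import Counter


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Mean triplet margin loss over a batch of embedding rows."""
    d_pos = np.linalg.norm(anchor - positive, axis=1)  # anchor to same-class sample
    d_neg = np.linalg.norm(anchor - negative, axis=1)  # anchor to other-class sample
    return float(np.mean(np.maximum(d_pos - d_neg + margin, 0.0)))


def knn_identify(query, index_vectors, index_labels, k=3):
    """Return the majority label among the k nearest stored embeddings,
    together with the indices of those neighbours."""
    dists = np.linalg.norm(index_vectors - query, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = Counter(index_labels[i] for i in nearest)
    return votes.most_common(1)[0][0], nearest.tolist()


# Hypothetical 2-D embeddings standing in for backbone outputs.
index_vectors = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],   # species_A cluster
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],   # species_B cluster
])
index_labels = ["species_A"] * 3 + ["species_B"] * 3

label, neighbours = knn_identify(np.array([0.2, 0.1]), index_vectors, index_labels)
print(label, neighbours)
```

Because identification reduces to a nearest-neighbour lookup, a new class can in principle be supported by adding its embeddings to the index without retraining a classification head, which is the scalability argument made above.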