Evaluating Vision-Language Models for Hematology Image Classification: Performance Analysis of CLIP and its Biomedical AI Variants

Tanviben S Patel; Hoda El-Sayed; Md Kamruzzaman Sarker

doi:10.23919/FRUCT64283.2024.10749850

Proceedings of the XXth Conference of Open Innovations Association FRUCT (Nov 2024)

Affiliations

Tanviben S Patel: Bowie State University
Hoda El-Sayed: Bowie State University
Md Kamruzzaman Sarker: Bowie State University

Journal volume & issue: Vol. 36, no. 1
pp. 578 – 584

Abstract

Vision-language models (VLMs) have shown remarkable potential in various domains, particularly in zero-shot learning applications. This research evaluates three notable VLMs (CLIP, PLIP, and BiomedCLIP) on blood cell classification, with a specific emphasis on distinguishing normal from malignant (cancerous) cells. While CLIP demonstrates robust zero-shot capabilities in general tasks, this study probes its biomedical adaptations, PLIP and BiomedCLIP, to assess their effectiveness in specialized medical tasks such as hematological image classification. We also investigate the impact of prompt engineering, exploring how variations in prompt construction influence accuracy across the biomedical datasets. Experiments were conducted on a variety of biomedical images, including microscopic blood cell images, brain MRIs, and chest X-rays, to evaluate the VLMs' applicability in medical imaging. Our findings reveal that while CLIP, trained on general datasets, performs well in broader contexts, PLIP and BiomedCLIP, which are optimized for medical imagery, demonstrate enhanced accuracy in medical settings, particularly in hematology. The results underscore the strengths and limitations of these models and offer insights into their adaptability, precision, and potential for future applications in medical image classification.
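
For readers unfamiliar with the zero-shot setup the abstract refers to, the following is a minimal sketch of how a CLIP-style model typically assigns a label: the image embedding is compared by cosine similarity against text embeddings of one or more prompts per class, and the scores are passed through a softmax. It is an illustration under stated assumptions, not the authors' pipeline. The function names, the toy three-dimensional vectors, the prompt averaging step, and the temperature value are all hypothetical; in practice the embeddings would come from the image and text encoders of CLIP, PLIP, or BiomedCLIP.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_classify(image_emb, prompt_embs, temperature=0.01):
    """Pick the class whose prompts are most similar to the image.

    prompt_embs maps each class label to a list of text embeddings,
    one per prompt variant. Similarities are averaged per class, so
    changing or adding prompt wordings changes the class score.
    The temperature is an assumed value, not taken from the paper.
    """
    labels = list(prompt_embs)
    sims = [
        sum(cosine(image_emb, p) for p in prompt_embs[label]) / len(prompt_embs[label])
        for label in labels
    ]
    # Numerically stable softmax over temperature-scaled similarities.
    logits = [s / temperature for s in sims]
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = {label: e / total for label, e in zip(labels, exps)}
    return max(probs, key=probs.get), probs


# Toy embeddings invented for illustration only; they are NOT model outputs.
# Each class has two prompt variants, e.g. two wordings of the same label.
prompt_embs = {
    "normal": [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1]],
    "malignant": [[0.0, 1.0, 0.2], [0.1, 0.9, 0.3]],
}
image_emb = [0.2, 0.8, 0.1]

label, probs = zero_shot_classify(image_emb, prompt_embs)
print(label, probs)
```

Averaging over several prompt wordings per class is one simple way to study prompt sensitivity: swapping the entries in `prompt_embs` for embeddings of differently phrased prompts shows how the predicted label and its probability shift.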

Published in Proceedings of the XXth Conference of Open Innovations Association FRUCT

ISSN: 2305-7254 (Print); 2343-0737 (Online)
Publisher: FRUCT
Country of publisher: Finland
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Telecommunication
Website: http://fruct.org/publication