ConVision Benchmark: A Contemporary Framework to Benchmark CNN and ViT Models

Shreyas Bangalore Vijayakumar; Krishna Teja Chitty-Venkata; Kanishk Arya; Arun K. Somani

doi:10.3390/ai5030056

AI (Jul 2024)

Affiliations

Shreyas Bangalore Vijayakumar: College of Engineering, Iowa State University, Ames, IA 50011, USA
Krishna Teja Chitty-Venkata: College of Engineering, Iowa State University, Ames, IA 50011, USA
Kanishk Arya: Department of Computer Engineering and Technology, MIT World Peace University, Pune 411038, India
Arun K. Somani: College of Engineering, Iowa State University, Ames, IA 50011, USA

DOI: https://doi.org/10.3390/ai5030056
Journal volume & issue: Vol. 5, no. 3
pp. 1132 – 1171

Abstract

Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) have shown remarkable performance in computer vision tasks, including object detection and image recognition. These models have evolved significantly in architecture, efficiency, and versatility. Concurrently, deep-learning frameworks have diversified, with versions that often complicate reproducibility and unified benchmarking. We propose ConVision Benchmark, a comprehensive framework in PyTorch, to standardize the implementation and evaluation of state-of-the-art CNN and ViT models. This framework addresses common challenges such as version mismatches and inconsistent validation metrics. As a proof of concept, we performed an extensive benchmark analysis on a COVID-19 dataset, encompassing nearly 200 CNN and ViT models in which DenseNet-161 and MaxViT-Tiny achieved exceptional accuracy with a peak performance of around 95%. Although we primarily used the COVID-19 dataset for image classification, the framework is adaptable to a variety of datasets, enhancing its applicability across different domains. Our methodology includes rigorous performance evaluations, highlighting metrics such as accuracy, precision, recall, F1 score, and computational efficiency (FLOPs, MACs, CPU, and GPU latency). The ConVision Benchmark facilitates a comprehensive understanding of model efficacy, aiding researchers in deploying high-performance models for diverse applications.
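
The classification metrics named in the abstract (accuracy, precision, recall, F1 score) can be illustrated with a minimal sketch in plain Python. This is not the ConVision Benchmark implementation: the function name `classification_metrics`, the macro-averaging over classes, and the toy labels are all assumptions made for illustration, since the abstract does not state how the per-class scores are averaged.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Hypothetical helper for illustration only; macro-averaging is an
    assumption, not something stated in the abstract.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()

    # Count per-class true positives, false positives, and false negatives.
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    precisions, recalls, f1s = [], [], []
    for c in labels:
        predicted = tp[c] + fp[c]
        actual = tp[c] + fn[c]
        prec = tp[c] / predicted if predicted else 0.0
        rec = tp[c] / actual if actual else 0.0
        f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels, made up for this example (not data from the paper).
y_true = ["covid", "covid", "normal", "normal"]
y_pred = ["covid", "normal", "normal", "normal"]
metrics = classification_metrics(y_true, y_pred)
```

On these toy labels, three of the four predictions are correct, so `metrics["accuracy"]` is 0.75. Precision and recall differ per class because the single error counts as a false negative for "covid" and a false positive for "normal".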

Published in AI

ISSN: 2673-2688 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.mdpi.com/journal/ai
