Model Soups for Various Training and Validation Data
Abstract
Model soups combine multiple models that have been fine-tuned with different hyperparameters, selecting them according to their accuracy on the validation data. Conventionally, the individual models are trained on the same training and validation sets. In this study, we maximized fine-tuning accuracy while keeping the inference time and memory cost of a single model. We extended model soups by creating k subsets of the training and validation data, in a manner similar to k-fold cross-validation, and training models on these subsets. First, we showed the correlation between validation and test accuracy when models whose training data contain the validation data are combined. We then showed that combining the k resulting models, each of which is itself a combination of models trained on the same training and validation subset, yields a single model with high test accuracy. This study provides a method for training models with both high accuracy and reliability on small datasets such as medical images.
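As a rough illustration (not the authors' code), the two-level combination described above can be sketched as uniform weight averaging of PyTorch state dicts; the function names, the uniform-averaging choice, and the omission of hyperparameter search and validation-based selection are assumptions of this sketch.

```python
# Minimal sketch of two-level model-soup averaging, assuming each fine-tuned
# model is available as a PyTorch state dict. Greedy selection by validation
# accuracy and the k-fold-style data splitting are omitted for brevity.
from typing import Dict, List
import torch


def average_state_dicts(state_dicts: List[Dict[str, torch.Tensor]]) -> Dict[str, torch.Tensor]:
    """Uniformly average state dicts that share identical keys and shapes."""
    keys = state_dicts[0].keys()
    return {
        k: torch.mean(torch.stack([sd[k].float() for sd in state_dicts]), dim=0)
        for k in keys
    }


def two_level_soup(models_per_subset: List[List[Dict[str, torch.Tensor]]]) -> Dict[str, torch.Tensor]:
    """First average the models fine-tuned on the same training/validation
    subset, then average the k per-subset soups into a single model."""
    per_subset_soups = [average_state_dicts(models) for models in models_per_subset]
    return average_state_dicts(per_subset_soups)
```

The resulting weights can be loaded into a single network with `model.load_state_dict(...)`, so inference time and memory remain those of one model.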
Keywords