Optimizing Rare Disease Gait Classification through Data Balancing and Generative AI: Insights from Hereditary Cerebellar Ataxia

Dante Trabassi; Stefano Filippo Castiglia; Fabiano Bini; Franco Marinozzi; Arash Ajoudani; Marta Lorenzini; Giorgia Chini; Tiwana Varrecchia; Alberto Ranavolo; Roberto De Icco; Carlo Casali; Mariano Serrao

doi:10.3390/s24113613

Sensors (Jun 2024)

Affiliations

Dante Trabassi: Department of Medical and Surgical Sciences and Biotechnologies, “Sapienza” University of Rome, 04100 Latina, Italy
Stefano Filippo Castiglia: Department of Medical and Surgical Sciences and Biotechnologies, “Sapienza” University of Rome, 04100 Latina, Italy
Fabiano Bini: Department of Mechanical and Aerospace Engineering, Sapienza University of Rome, 00184 Rome, Italy
Franco Marinozzi: Department of Mechanical and Aerospace Engineering, Sapienza University of Rome, 00184 Rome, Italy
Arash Ajoudani: Department of Advanced Robotics, Italian Institute of Technology, 16163 Genoa, Italy
Marta Lorenzini: Department of Advanced Robotics, Italian Institute of Technology, 16163 Genoa, Italy
Giorgia Chini: Department of Occupational and Environmental Medicine, Epidemiology and Hygiene, INAIL, Monte Porzio Catone, 00078 Rome, Italy
Tiwana Varrecchia: Department of Occupational and Environmental Medicine, Epidemiology and Hygiene, INAIL, Monte Porzio Catone, 00078 Rome, Italy
Alberto Ranavolo: Department of Occupational and Environmental Medicine, Epidemiology and Hygiene, INAIL, Monte Porzio Catone, 00078 Rome, Italy
Roberto De Icco: Department of Brain and Behavioral Sciences, University of Pavia, 27100 Pavia, Italy
Carlo Casali: Department of Medical and Surgical Sciences and Biotechnologies, “Sapienza” University of Rome, 04100 Latina, Italy
Mariano Serrao: Department of Medical and Surgical Sciences and Biotechnologies, “Sapienza” University of Rome, 04100 Latina, Italy

Journal volume & issue: Vol. 24, no. 11, p. 3613

Abstract

The interpretability of gait analysis studies in people with rare diseases, such as those with primary hereditary cerebellar ataxia (pwCA), is frequently limited by small sample sizes and unbalanced datasets. The purpose of this study was to assess the effectiveness of data balancing and generative artificial intelligence (AI) algorithms in generating synthetic data that reflect the actual gait abnormalities of pwCA. Gait data of 30 pwCA (age: 51.6 ± 12.2 years; 13 females, 17 males) and 100 healthy subjects (age: 57.1 ± 10.4 years; 60 females, 40 males) were collected at the lumbar level with an inertial measurement unit. Subsampling, oversampling, synthetic minority oversampling, generative adversarial networks, and conditional tabular generative adversarial networks (ctGAN) were applied to generate datasets that were then input to a random forest classifier. Consistency and explainability metrics were also calculated to assess the coherence of the generated datasets with known gait abnormalities of pwCA. ctGAN significantly improved the classification performance compared with the original dataset and traditional data augmentation methods. ctGANs are an effective method for balancing tabular datasets from populations with rare diseases, owing to their ability to improve diagnostic models with consistent explainability.
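
The two simplest balancing strategies named in the abstract, random oversampling and synthetic minority oversampling, can be illustrated with a minimal sketch. It rests on several assumptions: the function names `random_oversample` and `smote_like` are hypothetical, the feature values are random placeholders and not the study's gait measurements, and only the class sizes (30 pwCA, 100 healthy subjects) come from the abstract. It is not the authors' implementation, and it does not cover the GAN or ctGAN steps.

```python
import random


def random_oversample(minority, target_size, rng):
    """Duplicate randomly chosen minority rows until target_size rows exist."""
    extra = [list(rng.choice(minority)) for _ in range(target_size - len(minority))]
    return [list(row) for row in minority] + extra


def smote_like(minority, target_size, rng, k=5):
    """Add synthetic rows interpolated between a minority row and one of its
    k nearest minority neighbours (the core idea of SMOTE, simplified)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    while len(minority) + len(synthetic) < target_size:
        base = rng.choice(minority)
        # k nearest neighbours of the chosen row, excluding the row itself
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: sq_dist(base, row),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return [list(row) for row in minority] + synthetic


rng = random.Random(0)
# Placeholder data: two made-up features per subject, NOT real gait indexes.
pwca = [[rng.gauss(1.0, 0.3), rng.gauss(0.5, 0.1)] for _ in range(30)]
healthy = [[rng.gauss(0.0, 0.3), rng.gauss(0.8, 0.1)] for _ in range(100)]

oversampled = random_oversample(pwca, len(healthy), rng)
smoted = smote_like(pwca, len(healthy), rng)
print(len(oversampled), len(smoted))  # both minority sets now match the majority size
```

Random oversampling only repeats existing rows, whereas the interpolation step produces new rows that stay inside the range spanned by the observed minority samples. Either balanced set could then be combined with the majority class and passed to a classifier such as a random forest.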

Published in Sensors

ISSN: 1424-8220 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Chemical technology
Website: http://www.mdpi.com/journal/sensors
