Exploring Computational Data Amplification and Imputation for the Discovery of Type 1 Diabetes (T1D) Biomarkers from Limited Human Datasets

Oscar Alcazar; Mitsunori Ogihara; Gang Ren; Peter Buchwald; Midhat H. Abdulreda

doi:10.3390/biom12101444

Biomolecules (Oct 2022)

Affiliations

Oscar Alcazar: Diabetes Research Institute, University of Miami Miller School of Medicine, Miami, FL 33136, USA
Mitsunori Ogihara: Institute for Data Science and Computing, University of Miami, Coral Gables, FL 33146, USA
Gang Ren: Institute for Data Science and Computing, University of Miami, Coral Gables, FL 33146, USA
Peter Buchwald: Diabetes Research Institute, University of Miami Miller School of Medicine, Miami, FL 33136, USA
Midhat H. Abdulreda: Diabetes Research Institute, University of Miami Miller School of Medicine, Miami, FL 33136, USA

Journal volume & issue: Vol. 12, no. 10, p. 1444

Abstract

Background: Type 1 diabetes (T1D) is a devastating disease with serious health complications. Early T1D biomarkers that could enable timely detection and prevention before the onset of clinical symptoms are paramount but currently unavailable. Despite their promise, omics approaches have so far failed to deliver such biomarkers, likely because of the fragmented nature of the information obtained through single-omics approaches. We recently demonstrated the utility of parallel multi-omics for the identification of T1D biomarker signatures. Our studies also identified challenges.

Methods: Here, we evaluated a novel computational approach of data imputation and amplification as one way to overcome the challenges associated with the relatively small number of subjects in these studies.

Results: Using proprietary algorithms, we amplified our quadra-omics (proteomics, metabolomics, lipidomics, and transcriptomics) dataset from nine subjects a thousand-fold and analyzed the data using Ingenuity Pathway Analysis (IPA) software to assess how the analytical capabilities and biomarker prediction power of IPA changed in the amplified datasets compared to the original. These studies showed that an increased number of T1D-relevant pathways and biomarkers could be identified in the computationally amplified datasets, especially at imputation ratios close to the “golden ratio” of 38.2%:61.8%. Specifically, the Canonical Pathway and Diseases and Functions modules identified higher numbers of inflammatory pathways and functions relevant to autoimmune T1D, including novel ones not identified in the original data. The Biomarker Prediction module also predicted, in the amplified data, several unique biomarker candidates with direct links to T1D pathogenesis.

Conclusions: These preliminary findings indicate that such large-scale data imputation and amplification approaches are useful in facilitating the discovery of candidate integrated biomarker signatures of T1D or other diseases by increasing the predictive range of existing data mining tools, especially when the size of the input data is inherently limited.

Published in Biomolecules

ISSN: 2218-273X (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science: Microbiology
Website: https://www.mdpi.com/journal/biomolecules
