PLoS ONE (Jan 2020)
Using association rule mining to jointly detect clinical features and differentially expressed genes related to chronic inflammatory diseases.
Abstract
Objective: It is increasingly common to find patients affected by a combination of type 2 diabetes mellitus (T2DM), dyslipidemia (DLP) and periodontitis (PD), which are chronic inflammatory diseases. Further studies that capture unknown relationships among these diseases will help strengthen the biological and clinical evidence. The aim of this study was to apply association rule mining (ARM) to discover whether there are consistent patterns of clinical features (CFs) and differentially expressed genes (DEGs) relevant to these diseases. We intend to reinforce the evidence of the T2DM-DLP-PD interplay and to demonstrate the ability of ARM to provide new insights into multivariate pattern discovery.
Methods: We used 29 clinical glycemic, lipid and periodontal parameters from 143 patients divided into five groups based on diabetic, dyslipidemic and periodontal conditions (including a healthy control group). At least 5 patients from each group were selected for transcriptome assessment by microarray. ARM was used to assess relevant association rules considering: (i) only CFs; and (ii) CFs+DEGs. The DEGs identified in this way, specific to each group of patients, were submitted to gene expression validation by quantitative polymerase chain reaction (qPCR).
Results: We obtained 78 CF-rules and 161 CF+DEG-rules. Based on their clinical significance, periodontist and geneticist experts selected 11 CF-rules and 5 CF+DEG-rules. Of the five DEGs identified by the rules, four were validated by qPCR as significantly different from the control group, and two of them confirmed the previous microarray findings.
Conclusions: ARM was a powerful data analysis technique for identifying multivariate patterns involving the clinical and molecular profiles of patients affected by specific pathological panels. ARM proved to be an effective mining approach for analyzing gene expression, with the advantage of including patients' CFs.
A combination of CFs and DEGs might be employed in modeling a patient's chance of developing complex diseases, such as those studied here.
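To make the ARM terminology concrete, the following is a minimal sketch of how rules combining clinical features and gene-expression items can be scored by support, confidence and lift. It is illustrative only: the abstract does not state which ARM algorithm or thresholds the study used, so a brute-force enumeration stands in for a real miner such as Apriori, and the records, item names (such as `geneA_up`) and cutoffs are hypothetical and do not come from the study.

```python
from itertools import combinations

# Hypothetical toy records: each patient is a set of discretized items
# (conditions plus one made-up expression item). Not data from the study.
patients = [
    {"T2DM", "DLP", "PD", "geneA_up"},
    {"T2DM", "DLP", "PD", "geneA_up"},
    {"T2DM", "PD", "geneA_up"},
    {"DLP", "PD"},
    {"DLP"},
    {"healthy"},
]


def support(itemset, records):
    """Fraction of records that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= record for record in records) / len(records)


def mine_rules(records, min_support=0.3, min_confidence=0.8, max_len=3):
    """Brute-force search for rules antecedent -> consequent.

    Assumed thresholds; a real analysis would tune them and use an
    efficient algorithm instead of enumerating every itemset.
    """
    items = sorted(set().union(*records))
    rules = []
    for k in range(2, max_len + 1):
        for itemset in combinations(items, k):
            s = support(itemset, records)
            if s == 0 or s < min_support:
                continue
            # Split the frequent itemset into every antecedent/consequent pair.
            for i in range(1, k):
                for antecedent in combinations(itemset, i):
                    consequent = tuple(x for x in itemset if x not in antecedent)
                    confidence = s / support(antecedent, records)
                    lift = confidence / support(consequent, records)
                    if confidence >= min_confidence:
                        rules.append((antecedent, consequent, s, confidence, lift))
    return rules


rules = mine_rules(patients)
for antecedent, consequent, s, confidence, lift in rules:
    print(f"{antecedent} -> {consequent}: "
          f"support={s:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 indicates that the antecedent and consequent co-occur more often than expected if they were independent. In a workflow like the one described, such scores would only rank candidate rules; selection by domain experts and laboratory validation would still follow.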