Epigenomes (May 2024)

Statistical Models for High-Risk Intestinal Metaplasia with DNA Methylation Profiling

  • Tianmeng Wang,
  • Yifei Huang,
  • Jie Yang

DOI
https://doi.org/10.3390/epigenomes8020019
Journal volume & issue
Vol. 8, no. 2
p. 19

Abstract

We consider the newly developed multinomial mixed-link models for a high-risk intestinal metaplasia (IM) study with DNA methylation data. Unlike the traditional multinomial logistic models commonly used for categorical responses, the mixed-link models allow us to select the most appropriate link function for each category. We show that the selected multinomial mixed-link model (Model 1), which uses the total number of stem cell divisions (TNSC) derived from DNA methylation data, outperforms the traditional logistic models in terms of cross-entropy loss under ten-fold cross-validation, with significant p-values of 8.12 × 10⁻⁴ and 6.94 × 10⁻⁵. Based on our selected model, the effect of TNSC in predicting the risk of IM is significant, with a p-value less than 10⁻⁶. We also select the most appropriate mixed-link models (Models 2 and 3) when an additional covariate, the status of gastric atrophy, is available. When the status is negative, mild, or moderate, we recommend Model 2; otherwise, we prefer Model 3. Both Models 2 and 3 predict the risk of IM significantly better than Model 1, which indicates that the status of gastric atrophy is informative in predicting the risk of IM.
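
The comparison criterion named in the abstract, cross-entropy loss over held-out folds, can be illustrated with a minimal sketch. This is not the authors' code: the function names `cross_entropy_loss` and `k_fold_indices` are hypothetical, and the contiguous fold assignment is an assumption made for simplicity (a real analysis would typically shuffle observations first).

```python
import math


def cross_entropy_loss(probs, labels):
    """Mean negative log-probability assigned to the observed category.

    probs  : list of per-observation probability vectors (one entry per category)
    labels : list of observed category indices
    """
    eps = 1e-12  # guard against log(0)
    total = 0.0
    for p, y in zip(probs, labels):
        total -= math.log(max(p[y], eps))
    return total / len(labels)


def k_fold_indices(n, k=10):
    """Split indices 0..n-1 into k contiguous folds of near-equal size.

    Assumption: no shuffling, purely for illustration.
    """
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds
```

In a cross-validation run, each fold would serve once as the held-out set: a model is fitted on the remaining folds, its predicted category probabilities for the held-out observations are scored with `cross_entropy_loss`, and the per-fold losses of competing models are then compared. A lower loss means the model assigned higher probability to the categories that were actually observed.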

Keywords