npj Precision Oncology (Jun 2024)

Artificial intelligence-based epigenomic, transcriptomic and histologic signatures of tobacco use in oral squamous cell carcinoma

  • Chi T. Viet,
  • Kesava R. Asam,
  • Gary Yu,
  • Emma C. Dyer,
  • Sara Kochanny,
  • Carissa M. Thomas,
  • Nicholas F. Callahan,
  • Anthony B. Morlandt,
  • Allen C. Cheng,
  • Ashish A. Patel,
  • Dylan F. Roden,
  • Simon Young,
  • James Melville,
  • Jonathan Shum,
  • Paul C. Walker,
  • Khanh K. Nguyen,
  • Stephanie N. Kidd,
  • Steve C. Lee,
  • Gretchen S. Folk,
  • Dan T. Viet,
  • Anupama Grandhi,
  • Jeremy Deisch,
  • Yi Ye,
  • Fatemeh Momen-Heravi,
  • Alexander T. Pearson,
  • Bradley E. Aouizerat

DOI
https://doi.org/10.1038/s41698-024-00605-x
Journal volume & issue
Vol. 8, no. 1
pp. 1–12

Abstract

Oral squamous cell carcinoma (OSCC) biomarker studies rarely combine multi-omic biomarker strategies with pertinent clinicopathologic characteristics to predict mortality. In this study we determine, for the first time, a combined epigenetic, gene expression, and histology signature that differentiates between patients with different tobacco use histories (heavy tobacco use with ≥10 pack-years vs. no tobacco use). Using The Cancer Genome Atlas (TCGA) cohort (n = 257) and an internal cohort (n = 40), we identify 3 epigenetic markers (GPR15, GNG12, GDNF) and 13 expression markers (IGHA2, SCG5, RPL3L, NTRK1, CD96, BMP6, TFPI2, EFEMP2, RYR3, DMTN, GPD2, BAALC, and FMO3) that are dysregulated between OSCC patients who never smoked and those with a ≥10 pack-year history. While mortality risk prediction based on smoking status and clinicopathologic covariates alone is inaccurate (c-statistic = 0.57), the combined epigenetic/expression and histologic signature achieves a c-statistic of 0.9409 in predicting 5-year mortality in OSCC patients.
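
For readers unfamiliar with the c-statistic reported above, the following is a minimal sketch of how a concordance statistic can be computed from follow-up times, event indicators, and predicted risk scores. The abstract does not state which estimator or software was used, so Harrell's pairwise formulation is an assumption here, and the function name `concordance_index` and the five-patient toy data are hypothetical illustrations, not values from the study. A value of 0.5 corresponds to uninformative prediction and 1.0 to perfect ranking.

```python
def concordance_index(times, events, risk_scores):
    """Harrell-style c-statistic (assumed estimator, not confirmed by the source).

    times       -- follow-up time for each patient
    events      -- 1 if death was observed, 0 if the patient was censored
    risk_scores -- predicted risk; higher means higher predicted mortality
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i died before patient j's
            # last follow-up, so we know i had the shorter survival.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # model ranked the pair correctly
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # ties in risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs; c-statistic is undefined")
    return concordant / comparable


# Hypothetical toy data (months of follow-up); NOT taken from the study.
times = [5, 12, 20, 36, 60]
events = [1, 1, 0, 1, 0]

perfect_risk = [0.9, 0.8, 0.5, 0.4, 0.1]   # ranks every comparable pair correctly
constant_risk = [0.5, 0.5, 0.5, 0.5, 0.5]  # carries no information

print(concordance_index(times, events, perfect_risk))   # 1.0
print(concordance_index(times, events, constant_risk))  # 0.5
```

On this scale, the reported 0.57 for smoking status plus clinicopathologic covariates sits close to the uninformative value of 0.5, whereas 0.9409 for the combined signature is close to perfect ranking.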