Cell Discovery (Sep 2022)

Artificial intelligence defines protein-based classification of thyroid nodules

  • Yaoting Sun,
  • Sathiyamoorthy Selvarajan,
  • Zelin Zang,
  • Wei Liu,
  • Yi Zhu,
  • Hao Zhang,
  • Wanyuan Chen,
  • Hao Chen,
  • Lu Li,
  • Xue Cai,
  • Huanhuan Gao,
  • Zhicheng Wu,
  • Yongfu Zhao,
  • Lirong Chen,
  • Xiaodong Teng,
  • Sangeeta Mantoo,
  • Tony Kiat-Hon Lim,
  • Bhuvaneswari Hariraman,
  • Serene Yeow,
  • Syed Muhammad Fahmy Alkaff,
  • Sze Sing Lee,
  • Guan Ruan,
  • Qiushi Zhang,
  • Tiansheng Zhu,
  • Yifan Hu,
  • Zhen Dong,
  • Weigang Ge,
  • Qi Xiao,
  • Weibin Wang,
  • Guangzhi Wang,
  • Junhong Xiao,
  • Yi He,
  • Zhihong Wang,
  • Wei Sun,
  • Yuan Qin,
  • Jiang Zhu,
  • Xu Zheng,
  • Linyan Wang,
  • Xi Zheng,
  • Kailun Xu,
  • Yingkuan Shao,
  • Shu Zheng,
  • Kexin Liu,
  • Ruedi Aebersold,
  • Haixia Guan,
  • Xiaohong Wu,
  • Dingcun Luo,
  • Wen Tian,
  • Stan Ziqing Li,
  • Oi Lian Kon,
  • Narayanan Gopalakrishna Iyer,
  • Tiannan Guo

DOI
https://doi.org/10.1038/s41421-022-00442-x
Journal volume & issue
Vol. 8, no. 1
pp. 1 – 17

Abstract

Determination of malignancy in thyroid nodules remains a major diagnostic challenge. Here we report the feasibility and clinical utility of an AI-defined, protein-based biomarker panel for the diagnostic classification of thyroid nodules. The panel was developed initially on formalin-fixed paraffin-embedded (FFPE) tissue and further refined for fine-needle aspiration (FNA) specimens, whose minute amounts pose technical challenges for other methods. We first developed a neural network model of 19 protein biomarkers based on the proteomes of 1724 FFPE thyroid tissue samples from a retrospective cohort. This classifier achieved over 91% accuracy in the discovery set for classifying malignant thyroid nodules. It was then externally validated by blinded analyses in a retrospective cohort of 288 nodules (FFPE; 89% accuracy) and a prospective cohort of 294 FNA biopsies (85% accuracy) from twelve independent clinical centers. This study shows that integrating high-throughput proteomics and AI technology in multi-center retrospective and prospective clinical cohorts facilitates precise disease diagnosis that is difficult to achieve by other methods.
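
The accuracy figures quoted in the abstract are proportions of nodules classified correctly against the reference diagnosis. As a minimal sketch of how such a figure is computed, the Python snippet below scores binary predictions against reference labels. The function name `classification_metrics`, the label coding (1 = malignant, 0 = benign), and the toy labels are illustrative assumptions, not the study's code or data.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, sensitivity, and specificity for binary labels.

    Assumed coding (hypothetical, for illustration): 1 = malignant, 0 = benign.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malignant, called malignant
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign, called benign
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign, called malignant
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malignant, called benign

    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Toy labels invented for this example; they are not results from the study.
toy_truth = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
toy_pred = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]
metrics = classification_metrics(toy_truth, toy_pred)
print(metrics)
```

Accuracy alone can hide an imbalance between missed malignancies and false alarms, which is why the sketch also returns sensitivity and specificity.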