Viruses (May 2022)

Convolutional Neural Networks Based on Sequential Spike Predict the High Human Adaptation of SARS-CoV-2 Omicron Variants

  • Bei-Guang Nan,
  • Sen Zhang,
  • Yu-Chang Li,
  • Xiao-Ping Kang,
  • Yue-Hong Chen,
  • Lin Li,
  • Tao Jiang,
  • Jing Li

DOI
https://doi.org/10.3390/v14051072
Journal volume & issue
Vol. 14, no. 5
p. 1072

Abstract

The COVID-19 pandemic has repeatedly given rise to more transmissible SARS-CoV-2 variants, such as Omicron, which has in turn produced its own sublineages. Distinguishing high-risk Omicron sublineages from other SARS-CoV-2 lineages remains a challenge. We aimed to build a fine-grained deep learning (DL) model to assess SARS-CoV-2 transmissibility, updating our earlier coarse-grained model, using training and validation data from early-stage SARS-CoV-2 variants and based on sequential Spike samples. The sequential amino acid (AA) frequency of Spike was decomposed into fragments using serial and sliding windows. Unsupervised machine learning approaches were used to examine the distribution of the sequential AA frequency, and a supervised Convolutional Neural Network (CNN) with three adaptation labels was then built to predict the human adaptation of Omicron sublineages. The results showed clear inter-lineage separation and intra-lineage clustering of SARS-CoV-2 variants in the decomposed sequential AAs. In validation, the predictor accurately classified variants with different adaptation levels. For Omicron, it predicted higher adaptation for the BA.2 sublineage and middle-level adaptation for the BA.1/BA.1.1 sublineages. In summary, the Omicron BA.2 sublineage is more adaptive than BA.1/BA.1.1 and has spread more rapidly, particularly in Europe. The fine-grained adaptation DL model works well for the timely assessment of SARS-CoV-2 variant transmissibility, which facilitates the control of emerging variants.
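
The decomposition of sequential AA frequency into windowed fragments can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `windowed_aa_frequency`, the window length, the step size, and the toy input string are all assumptions chosen for illustration, since the abstract does not state the actual parameters.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so every vector is comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def windowed_aa_frequency(sequence, window=4, step=2):
    """Return one 20-element AA frequency vector per sliding window.

    `window` and `step` are hypothetical values, not taken from the paper.
    Setting step == window gives serial (non-overlapping) fragments;
    step < window gives sliding (overlapping) fragments.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    features = []
    for start in range(0, len(sequence) - window + 1, step):
        fragment = sequence[start:start + window]
        counts = Counter(fragment)
        # Relative frequency of each amino acid within this fragment.
        features.append([counts.get(aa, 0) / window for aa in AMINO_ACIDS])
    return features


# Toy input for illustration only (not a full Spike sequence).
toy_sequence = "MFVFLVLLPL"
sliding = windowed_aa_frequency(toy_sequence, window=4, step=2)
serial = windowed_aa_frequency(toy_sequence, window=5, step=5)
```

Each fragment becomes a fixed-length vector, so stacking the vectors yields a matrix that could be passed to a clustering method or to a convolutional classifier of the kind the abstract describes.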

Keywords