BMC Ecology and Evolution (Jul 2021)

DeepHBV: a deep learning model to predict hepatitis B virus (HBV) integration sites

  • Canbiao Wu,
  • Xiaofang Guo,
  • Mengyuan Li,
  • Jingxian Shen,
  • Xiayu Fu,
  • Qingyu Xie,
  • Zeliang Hou,
  • Manman Zhai,
  • Xiaofan Qiu,
  • Zifeng Cui,
  • Hongxian Xie,
  • Pengmin Qin,
  • Xuchu Weng,
  • Zheng Hu,
  • Jiuxing Liang

DOI
https://doi.org/10.1186/s12862-021-01869-8
Journal volume & issue
Vol. 21, no. 1
pp. 1 – 10

Abstract

Background: The hepatitis B virus (HBV) is one of the main causes of viral hepatitis and liver cancer. HBV integration is one of the key steps in virus-promoted malignant transformation.

Results: An attention-based deep learning model, DeepHBV, was developed to predict HBV integration sites. DeepHBV learns local genomic features automatically and was trained and tested on HBV integration site data from the dsVIS database. Initially, DeepHBV achieved an area under the receiver operating characteristic curve (AUROC) of 0.6363 and an area under the precision-recall curve (AUPR) of 0.5471 on this dataset. Adding genomic features from repeat peaks and from TCGA Pan-Cancer peaks significantly improved model performance, with AUROCs of 0.8378 and 0.9430 and AUPRs of 0.7535 and 0.9310, respectively. Transcription factor binding sites (TFBS) were significantly enriched near the genomic positions that the model considered. The binding sites of the AR-halfsite, Arnt, Atf1, bHLHE40, bHLHE41, BMAL1, CLOCK, c-Myc, COUP-TFII, E2A, EBF1, Erra, and Foxo3 were highlighted by DeepHBV in both the dsVIS and VISDB datasets, revealing a novel integration preference for HBV.

Conclusions: DeepHBV is a useful tool for predicting HBV integration sites and offers novel insights into HBV integration-related carcinogenesis.
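The two metrics reported above, AUROC and AUPR, can be illustrated with a minimal sketch. It is not taken from DeepHBV: the function names and the toy labels and scores are invented for illustration, and AUPR is estimated here as average precision, which is one common estimator and may differ from the one the authors used. The sketch also assumes there are no tied scores.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision values measured at the rank of each positive."""
    n_pos = sum(1 for y in labels if y == 1)
    if n_pos == 0:
        raise ValueError("need at least one positive example")
    # Rank examples from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this cut-off
    return total / n_pos


# Toy data (hypothetical): 1 = integration site, 0 = background region.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

print(f"AUROC = {auroc(toy_labels, toy_scores):.4f}")
print(f"AUPR (average precision) = {average_precision(toy_labels, toy_scores):.4f}")
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect ranking. The AUPR baseline is the fraction of positives in the dataset, so AUPR values are only comparable across datasets with similar class balance.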

Keywords