Virology Journal (May 2021)

The genetic variability, phylogeny and functional significance of E6, E7 and LCR in human papillomavirus type 52 isolates in Sichuan, China

  • Zhilin Song,
  • Yanru Cui,
  • Qiufu Li,
  • Junhang Deng,
  • Xianping Ding,
  • Jiaoyu He,
  • Yiran Liu,
  • Zhuang Ju,
  • Liyuan Fang

DOI
https://doi.org/10.1186/s12985-021-01565-5
Journal volume & issue
Vol. 18, no. 1
pp. 1 – 10

Abstract


Background: Variations in the human papillomavirus (HPV) E6 and E7 genes have been shown to be closely related to viral persistence and to the occurrence and development of cervical cancer. The long control region (LCR) of HPV has been shown to have multiple functions in regulating viral transcription. In recent years there have been reports on E6/E7/LCR of HPV-16 and HPV-58, but few studies on HPV-52, especially on its LCR. In this study, we focused on gene polymorphism of the HPV-52 E6/E7/LCR sequences, assessed the effects of variations on the immune recognition of the viral E6 and E7 antigens, predicted the effect of LCR variations on transcription factor binding sites, and provided more basic data for further study of E6/E7/LCR in Chengdu, China.

Methods: The LCR, E6 and E7 regions of HPV-52 were amplified and sequenced for polymorphism and phylogenetic analysis. Sequences were aligned with the reference sequence in MEGA 7.0 to identify single nucleotide polymorphisms (SNPs). A neighbor-joining phylogenetic tree was constructed with MEGA 7.0, and the secondary structures of the related proteins were predicted with PSIPRED 4.0. Selection pressure on the E6 and E7 coding regions was estimated by Bayes empirical Bayes analysis in PAML 4.9. HLA class I and class II binding peptides were predicted with the Immune Epitope Database server, B-cell epitopes with the ABCpred server, and transcription factor binding sites in the LCR with the JASPAR database.

Results: Fifty SNP sites (6 in E6, 10 in E7, 34 in LCR) were found. Ranked from most to least variable, the regions were LCR > E7 > E6. Two deletions, at nucleotide positions 7387–7391 (TTATG) and 7698–7700 (CTT), were found in all samples. A deletion at nucleotide positions 7287–7288 (TG) was found in 97.56% (40/41) of the samples. The combinations of all SNP sites and deletions resulted in 12 unique sequences. In the neighbor-joining phylogenetic tree, all sequences clustered into sub-lineage B2 except for one, which belonged to sub-lineage C2. No positive selection was observed in E6 or E7. Eight non-synonymous amino acid substitutions (E3Q and K93R in E6; T37I, S52D, Y59D, H61Y, D64N and L99R in E7) potentially affected multiple putative epitopes for CD4+ and CD8+ T cells and for B cells. A7168G was the most variable site (100%) and lies within a binding site for the transcription factor VAX1 in the LCR. In addition, the predictions showed that the LCR contained high-probability binding sites for the transcription factors SOX9, FOS, RAX, HOXA5, VAX1 and SRY.

Conclusion: This study provides basic data for understanding the relationship among E6/E7/LCR mutations, lineages and carcinogenesis. Furthermore, it provides insight into the intrinsic geographical relatedness and biological differences of the HPV-52 variants, and contributes to further research on HPV-52 therapeutic vaccine development.
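The SNP identification step in the abstract (comparing each aligned isolate with the reference sequence) can be illustrated with a minimal Python sketch. It is not the authors' pipeline, which used MEGA 7.0. The function name `call_snps`, the `start_position` parameter and the toy sequences are hypothetical, and the sketch assumes the two sequences have already been aligned to equal length with `-` marking gaps.

```python
def call_snps(reference, isolate, start_position=1):
    """List substitutions between two pre-aligned sequences.

    Each substitution is written as <reference base><position><isolate base>,
    with positions counted along the reference (gap columns in the
    reference do not advance the position). Insertions and deletions
    are skipped, because they are not single-nucleotide substitutions.
    """
    if len(reference) != len(isolate):
        raise ValueError("sequences must be aligned to the same length")

    snps = []
    ref_pos = start_position - 1
    for ref_base, iso_base in zip(reference.upper(), isolate.upper()):
        if ref_base != "-":
            ref_pos += 1  # advance only on real reference bases
        if ref_base == "-" or iso_base == "-":
            continue  # indel column: not reported as an SNP
        if ref_base != iso_base:
            snps.append(f"{ref_base}{ref_pos}{iso_base}")
    return snps


# Toy example (invented sequences, not HPV-52 data).
toy_reference = "ACGTAC"
toy_isolate = "ACATAC"
print(call_snps(toy_reference, toy_isolate, start_position=100))
```

Counting how many isolates carry each reported substitution would then give per-site frequencies of the kind quoted in the Results, such as 40/41 = 97.56%.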

Keywords