npj Genomic Medicine (Nov 2021)
Recurrent integration of human papillomavirus genomes at transcriptional regulatory hubs
Abstract
Oncogenic human papillomavirus (HPV) genomes are often integrated into host chromosomes in HPV-associated cancers. HPV genomes are integrated either as a single copy or as tandem repeats of viral DNA, with or without interspersed host DNA. Integration occurs frequently in common fragile sites susceptible to tandem repeat formation, and the flanking or interspersed host DNA often contains transcriptional enhancer elements. When co-amplified with the viral genome, these enhancers can form super-enhancer-like elements that drive high viral oncogene expression. Here we compiled highly curated datasets of HPV integration sites in cervical cancer (CESC) and head and neck squamous cell carcinoma (HNSCC), and assessed the number of breakpoints, viral transcriptional activity, and host genome copy number at each insertion site. Tumors frequently contained multiple distinct HPV integration sites but often only one “driver” site that expressed viral RNA. As common fragile sites and active enhancer elements are cell-type-specific, we mapped these regions in cervical cell lines using FANCD2 and Brd4/H3K27ac ChIP-seq, respectively. Large enhancer clusters, or super-enhancers, were also defined using the Brd4/H3K27ac ChIP-seq dataset. HPV integration breakpoints were enriched at both FANCD2-associated fragile sites and enhancer-rich regions, and frequently showed adjacent focal DNA amplification in CESC samples. We identified recurrent integration “hotspots” that were enriched for super-enhancers, some of which function as regulatory hubs for cell-identity genes. We propose that during persistent infection, extrachromosomal HPV minichromosomes associate with these transcriptional epicenters, and that accidental integration could promote viral oncogene expression and carcinogenesis.