Bioinformatics and Biology Insights (Feb 2022)
Decoding Human Genome Regulatory Features That Influence HIV-1 Proviral Expression and Fate Through an Integrated Genomics Approach
Abstract
Fundamental principles of HIV-1 integration into the human genome have been revealed over the past 2 decades. However, the impact of the integration site on proviral transcription and expression remains poorly understood. Solving this problem requires the analysis of multiple genomic datasets for thousands of proviral integration sites. Here, we generated and combined large-scale epigenetic, transcriptomic, and 3-dimensional genome architecture datasets to interrogate the chromatin states, transcriptional activity, and nuclear sub-compartments around HIV-1 integrations in Jurkat CD4+ T cells. Our aim was to decipher the human genome regulatory features that shape the transcription of proviral classes defined by their position and orientation in the genome. Using a Hidden Markov Model, followed by a ranking of informative values ahead of a machine learning logistic regression model, we identified nuclear sub-compartments and chromatin states, which contribute to the genomic architecture, transcriptional activity, and nucleosome density of regions neighboring the integration site, as additive features influencing HIV-1 expression. Our integrated genomics approach also enables a robust experimental design in which HIV-1 can be genetically introduced into precise genomic locations with known regulatory features to assess the relationship between integration position and viral transcription and fate.
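As an illustration of what ranking informative values before a logistic regression can look like, the following is a minimal sketch and not the authors' pipeline. It assumes that "informative value" refers to the information value (IV) statistic commonly used for feature screening, that each genomic feature is encoded as a binary flag per integration site, and that the outcome is a binary expressed/silent label. The feature names and toy data are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def information_value(feature: Sequence[int], outcome: Sequence[int],
                      eps: float = 0.5) -> float:
    """Information value of a binary feature against a binary outcome.

    eps is an additive smoothing constant (an assumption of this sketch)
    that keeps the logarithm finite when a cell count is zero.
    """
    n_pos = sum(outcome)                 # sites labeled 1 (e.g. expressed)
    n_neg = len(outcome) - n_pos         # sites labeled 0 (e.g. silent)
    iv = 0.0
    for level in (0, 1):
        pos = sum(1 for f, y in zip(feature, outcome) if f == level and y == 1)
        neg = sum(1 for f, y in zip(feature, outcome) if f == level and y == 0)
        p = (pos + eps) / (n_pos + 2 * eps)   # share of positives at this level
        q = (neg + eps) / (n_neg + 2 * eps)   # share of negatives at this level
        iv += (p - q) * math.log(p / q)       # weight-of-evidence contribution
    return iv


def rank_features(features: Dict[str, Sequence[int]],
                  outcome: Sequence[int]) -> List[Tuple[str, float]]:
    """Return (name, IV) pairs sorted from most to least informative."""
    scores = {name: information_value(col, outcome)
              for name, col in features.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: 8 integration sites, 1 = provirus expressed.
outcome = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "active_chromatin_state": [1, 1, 1, 0, 0, 0, 0, 1],
    "lamina_associated":      [0, 1, 0, 1, 0, 1, 0, 1],
}

ranked = rank_features(features, outcome)
for name, iv in ranked:
    print(f"{name}: IV = {iv:.3f}")
```

In a workflow of this kind, the top-ranked features would then be passed to a logistic regression classifier; the ranking step only screens candidates and does not by itself establish that their effects are additive.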