Scientific Reports (May 2023)
Targeted adaptive long-read sequencing for discovery of complex phased variants in inherited retinal disease patients
Abstract
Abstract Inherited retinal degenerations (IRDs) are a heterogeneous group of predominantly monogenic disorders with over 300 causative genes identified. Short-read exome sequencing is commonly used to genotypically diagnose patients with clinical features of IRDs, however, in up to 30% of patients with autosomal recessive IRDs, one or no disease-causing variants are identified. Furthermore, chromosomal maps cannot be reconstructed for allelic variant discovery with short-reads. Long-read genome sequencing can provide complete coverage of disease loci and a targeted approach can focus sequencing bandwidth to a genomic region of interest to provide increased depth and haplotype reconstruction to uncover cases of missing heritability. We demonstrate that targeted adaptive long-read sequencing on the Oxford Nanopore Technologies (ONT) platform of the USH2A gene from three probands in a family with the most common cause of the syndromic IRD, Usher Syndrome, resulted in greater than 12-fold target gene sequencing enrichment on average. This focused depth of sequencing allowed for haplotype reconstruction and phased variant identification. We further show that variants obtained from the haplotype-aware genotyping pipeline can be heuristically ranked to focus on potential pathogenic candidates without a priori knowledge of the disease-causing variants. Moreover, consideration of the variants unique to targeted long-read sequencing that are not covered by short-read technology demonstrated higher precision and F1 scores for variant discovery by long-read sequencing. This work establishes that targeted adaptive long-read sequencing can generate targeted, chromosome-phased data sets for identification of coding and non-coding disease-causing alleles in IRDs and can be applicable to other Mendelian diseases.