Genome Medicine (Dec 2024)
Using multiplexed functional data to reduce variant classification inequities in underrepresented populations
Abstract
Background: Multiplexed Assays of Variant Effects (MAVEs) can test all possible single variants in a gene of interest. The resulting saturation-style functional data may help resolve variant classification disparities between populations, especially for Variants of Uncertain Significance (VUS).

Methods: We analyzed clinical significance classifications in 213,663 individuals of European-like genetic ancestry versus 206,975 individuals of non-European-like genetic ancestry from All of Us and the Genome Aggregation Database. We then incorporated clinically calibrated MAVE data into the Clinical Genome Resource's Variant Curation Expert Panel rules to automate VUS reclassification for BRCA1, TP53, and PTEN.

Results: Using two orthogonal statistical approaches, we show a higher prevalence (p ≤ 5.95e-06) of VUS in individuals of non-European-like genetic ancestry across all medical specialties assessed in all three databases. In the non-European-like genetic ancestry group, higher rates of Benign or Likely Benign variants and of variants with no clinical designation (p ≤ 2.5e-05) were found across many medical specialties, whereas Pathogenic or Likely Pathogenic assignments were more frequent in individuals of European-like genetic ancestry (p ≤ 2.5e-05). Using MAVE data, we reclassified VUS in individuals of non-European-like genetic ancestry at a significantly higher rate than VUS in individuals of European-like genetic ancestry (p = 9.1e-03), effectively compensating for the VUS disparity. Further, essential code analysis showed an equitable impact of MAVE evidence codes but an inequitable impact of allele frequency (p = 7.47e-06) and computational predictor (p = 6.92e-05) evidence codes for individuals of non-European-like genetic ancestry.

Conclusions: Generation of saturation-style MAVE data should be a priority to reduce VUS disparities and produce equitable training data for future computational predictors.
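The abstract does not name the two statistical approaches used. As an illustration only of how a difference in rates between two ancestry groups might be tested, the following minimal sketch applies a pooled two-proportion z-test. The function name `two_proportion_z_test` and the counts in the usage lines are hypothetical and are not taken from the study.

```python
import math


def two_proportion_z_test(hits_a, total_a, hits_b, total_b):
    """Two-sided pooled z-test for a difference between two proportions.

    Illustrative helper (an assumption, not the authors' method).
    Returns the z statistic and the two-sided p-value.
    """
    if total_a <= 0 or total_b <= 0:
        raise ValueError("group sizes must be positive")
    rate_a = hits_a / total_a
    rate_b = hits_b / total_b
    # Pooled rate under the null hypothesis that both groups share one rate.
    pooled = (hits_a + hits_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        raise ValueError("pooled rate of 0 or 1 leaves no variance to test")
    z = (rate_a - rate_b) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical toy counts: 60 of 200 variants reclassified in group A
# versus 40 of 200 in group B.
z, p = two_proportion_z_test(60, 200, 40, 200)
print(f"z = {z:.3f}, p = {p:.4f}")
```

With small counts or rates near 0 or 1, an exact test such as Fisher's exact test would be a more cautious choice than this normal approximation.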
Keywords