BioMedInformatics (Jun 2023)
Uncovering Disease-Related Polymorphisms through Correlations between SNP Frequencies, Population and Epidemiological Data
Abstract
Background: According to genome-wide association studies (GWAS), which analyze large numbers of DNA variants in case-control designs, the genetic differences between two human individuals do not exceed 0.5%. As a consequence, finding biological significance in GWAS results is a challenging task. We propose an alternative method for identifying disease-causing variants based on the simultaneous evaluation of genome variant data acquired from public databases and epidemiological data on the pathology. This method is grounded in the following premise: if a particular pathology is common in a population, genetic variants that confer susceptibility to that pathology should also be common in that population. Methods: Three groups of genes were evaluated to test this premise: variants related to depression found through GWAS, six genes unrelated to depression (the neutral group), and four genes already genotyped in case-control studies involving depression (TPH2, NR3C1, SLC6A2 and SLC6A3). Results: Among the GWAS depression-related variants, nine of the 82 SNPs evaluated showed a favorable correlation between allele frequency and epidemiological data. As anticipated, none of the 286 SNPs in the neutral group were correlated. As a proof of concept, two TPH2 variants, 26 NR3C1 variants and four SLC6A3 variants showed allele frequencies that correlated with depression rates in the epidemiological data. Conclusions: Together with data from the literature involving these SNPs, these correlations support this strategy as a complementary method for identifying possible disease-causing variants.
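The premise stated in the abstract can be illustrated with a minimal sketch that correlates the frequency of one allele with the prevalence of a pathology across several populations. The abstract does not name the correlation statistic used, so the Pearson coefficient here is an assumption, and the population labels, allele frequencies, and prevalence values are hypothetical placeholders rather than data from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical allele frequencies of a single SNP in five populations
allele_freq = {"POP_A": 0.12, "POP_B": 0.25, "POP_C": 0.31, "POP_D": 0.40, "POP_E": 0.18}

# Hypothetical prevalence of the pathology (%) in the same populations
prevalence = {"POP_A": 3.1, "POP_B": 4.0, "POP_C": 4.6, "POP_D": 5.2, "POP_E": 3.5}

# Align both series on the same population order before correlating
populations = sorted(allele_freq)
r = pearson_r(
    [allele_freq[p] for p in populations],
    [prevalence[p] for p in populations],
)
print(f"r = {r:.3f}")
```

Under the premise, a susceptibility variant would be expected to give a coefficient close to 1 across populations, whereas variants in unrelated genes would not show a consistent relationship. Repeating the calculation for every SNP in a gene gives a per-variant screen of the kind the abstract describes.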
Keywords