Alexandria Engineering Journal (Nov 2024)
Machine learning approaches for advanced detection of rare genetic disorders in whole-genome sequencing
Abstract
Rare genetic disorders are difficult to detect because of their low prevalence and complex genetic makeup. Traditional targeted genetic tests are limited in scope and often miss rare or previously unidentified variants. This project proposes a strategy that combines Whole-Genome Sequencing (WGS) data with Random Forest (RF) analysis to overcome these limitations. WGS provides a comprehensive view of an individual's genetic profile, capturing a wide range of variants that targeted approaches may overlook. RF handles complex datasets well and can detect non-linear interactions, which makes it suited to uncovering intricate links between gene variants and rare genetic diseases. The Swedish Genome Reference dataset serves as the foundation for this research: RF analysis is applied to this extensive dataset to identify patterns and associations that may reveal new genetic markers and previously unknown risk factors for these disorders. The combined WGS and RF approach achieves an accuracy of 97%. By highlighting the value of advanced computational techniques and comprehensive WGS databases, the project aims to improve the understanding, diagnosis, and treatment of these conditions and to pave the way for more personalized medical care, ultimately improving outcomes for patients affected by rare genetic diseases.
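To make the general idea concrete, the following is a minimal sketch of Random Forest classification on a genotype matrix, using scikit-learn. It is not the authors' pipeline and does not use the Swedish Genome Reference dataset: the data are synthetic, the disease rule (variants 3 and 17) is a hypothetical assumption chosen only for illustration, and all variable names are invented here. It shows how an RF can be trained on allele counts and how its feature importances might be used to rank candidate variants.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for a WGS-derived genotype matrix (NOT real data):
# rows are individuals, columns are variants, values are alt-allele counts 0/1/2.
n_samples, n_variants = 600, 50
X = rng.binomial(2, 0.2, size=(n_samples, n_variants))

# Hypothetical disease rule, assumed purely for illustration:
# carrying an alternate allele at variant 3 or variant 17 marks a case.
y = ((X[:, 3] >= 1) | (X[:, 17] >= 1)).astype(int)

# Hold out a quarter of the individuals to estimate accuracy on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit an ensemble of decision trees; each split considers a random feature subset.
forest = RandomForestClassifier(n_estimators=300, random_state=0)
forest.fit(X_train, y_train)

# Held-out accuracy and the two variants with the highest importance scores.
accuracy = forest.score(X_test, y_test)
ranked = np.argsort(forest.feature_importances_)[::-1]
top_variants = sorted(int(i) for i in ranked[:2])

print(f"held-out accuracy: {accuracy:.3f}")
print("top-ranked variants:", top_variants)
```

On real WGS data the feature matrix would be far wider than it is tall and would need quality control and variant filtering before a step like this; the sketch omits those stages, and the accuracy it prints says nothing about the 97% reported above.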