International Journal of Molecular Sciences (Apr 2022)
The Thousand Polish Genomes—A Database of Polish Variant Allele Frequencies
- Elżbieta Kaja,
- Adrian Lejman,
- Dawid Sielski,
- Mateusz Sypniewski,
- Tomasz Gambin,
- Mateusz Dawidziuk,
- Tomasz Suchocki,
- Paweł Golik,
- Marzena Wojtaszewska,
- Magdalena Mroczek,
- Maria Stępień,
- Joanna Szyda,
- Karolina Lisiak-Teodorczyk,
- Filip Wolbach,
- Daria Kołodziejska,
- Katarzyna Ferdyn,
- Maciej Dąbrowski,
- Alicja Woźna,
- Marcin Żytkiewicz,
- Anna Bodora-Troińska,
- Waldemar Elikowski,
- Zbigniew J. Król,
- Artur Zaczyński,
- Agnieszka Pawlak,
- Robert Gil,
- Waldemar Wierzba,
- Paula Dobosz,
- Katarzyna Zawadzka,
- Paweł Zawadzki,
- Paweł Sztromwasser
Affiliations
- Elżbieta Kaja
- MNM Bioscience Inc., Cambridge, MA 02142, USA
- Adrian Lejman
- MNM Bioscience Inc., Cambridge, MA 02142, USA
- Dawid Sielski
- MNM Bioscience Inc., Cambridge, MA 02142, USA
- Mateusz Sypniewski
- MNM Bioscience Inc., Cambridge, MA 02142, USA
- Tomasz Gambin
- Institute of Computer Science, Warsaw University of Technology, 00-665 Warsaw, Poland
- Mateusz Dawidziuk
- Department of Medical Genetics, Institute of Mother and Child, 01-211 Warsaw, Poland
- Tomasz Suchocki
- Biostatistics Group, Wrocław University of Environmental and Life Sciences, 51-631 Wrocław, Poland
- Paweł Golik
- Institute of Genetics and Biotechnology, Faculty of Biology, University of Warsaw, 02-106 Warsaw, Poland
- Marzena Wojtaszewska
- MNM Bioscience Inc., Cambridge, MA 02142, USA
- Magdalena Mroczek
- Department of Neurology and Neurophysiology, Balgrist University Hospital, University of Zurich, 8008 Zurich, Switzerland
- Maria Stępień
- MNM Bioscience Inc., Cambridge, MA 02142, USA
- Joanna Szyda
- Biostatistics Group, Wrocław University of Environmental and Life Sciences, 51-631 Wrocław, Poland
- Karolina Lisiak-Teodorczyk
- MNM Bioscience Inc., Cambridge, MA 02142, USA
- Filip Wolbach
- MNM Bioscience Inc., Cambridge, MA 02142, USA
- Daria Kołodziejska
- MNM Bioscience Inc., Cambridge, MA 02142, USA
- Katarzyna Ferdyn
- MNM Bioscience Inc., Cambridge, MA 02142, USA
- Maciej Dąbrowski
- MNM Bioscience Inc., Cambridge, MA 02142, USA
- Alicja Woźna
- MNM Bioscience Inc., Cambridge, MA 02142, USA
- Marcin Żytkiewicz
- Department of Internal Medicine, Józef Struś Multidisciplinary Municipal Hospital, 61-285 Poznan, Poland
- Anna Bodora-Troińska
- Department of Internal Medicine, Józef Struś Multidisciplinary Municipal Hospital, 61-285 Poznan, Poland
- Waldemar Elikowski
- Department of Internal Medicine, Józef Struś Multidisciplinary Municipal Hospital, 61-285 Poznan, Poland
- Zbigniew J. Król
- Central Clinical Hospital of Ministry of the Interior and Administration in Warsaw, 02-507 Warsaw, Poland
- Artur Zaczyński
- Central Clinical Hospital of Ministry of the Interior and Administration in Warsaw, 02-507 Warsaw, Poland
- Agnieszka Pawlak
- Central Clinical Hospital of Ministry of the Interior and Administration in Warsaw, 02-507 Warsaw, Poland
- Robert Gil
- Central Clinical Hospital of Ministry of the Interior and Administration in Warsaw, 02-507 Warsaw, Poland
- Waldemar Wierzba
- Central Clinical Hospital of Ministry of the Interior and Administration in Warsaw, 02-507 Warsaw, Poland
- Paula Dobosz
- MNM Bioscience Inc., Cambridge, MA 02142, USA
- Katarzyna Zawadzka
- MNM Bioscience Inc., Cambridge, MA 02142, USA
- Paweł Zawadzki
- MNM Bioscience Inc., Cambridge, MA 02142, USA
- Paweł Sztromwasser
- MNM Bioscience Inc., Cambridge, MA 02142, USA
- DOI
- https://doi.org/10.3390/ijms23094532
- Journal volume & issue
- Vol. 23, no. 9, p. 4532
Abstract
Although Slavic populations account for over 4.5% of the world's inhabitants, no centralised, open-source reference database of genetic variation of any Slavic population exists to date. Such data are crucial for clinical genetics and biomedical research, as well as for archeological and historical studies. The Polish population, which is homogeneous and sedentary in nature but influenced by many past migrations, is unique and could serve as a genetic reference for the Slavic nations. In this study, we analysed whole genomes of 1222 Poles to identify and genotype a wide spectrum of genomic variation, such as small and structural variants, runs of homozygosity, mitochondrial haplogroups, and de novo variants. Common variant analyses showed that the Polish cohort is highly homogeneous and shares ancestry with other European populations. In rare variant analyses, we identified 32 autosomal-recessive genes with significantly different frequencies of pathogenic alleles in the Polish population compared with non-Finnish Europeans, including C2, TGM5, NUP93, C19orf12, and PROP1. The allele frequencies for small and structural variants, calculated for 1076 unrelated individuals, are released publicly as The Thousand Polish Genomes database and will contribute to the worldwide genomic resources available to researchers and clinicians.
Keywords