BMC Bioinformatics (Apr 2023)

PyAGH: a python package to fast construct kinship matrices based on different levels of omic data

  • Wei Zhao
  • Qamar Raza Qadri
  • Zhenyang Zhang
  • Zhen Wang
  • Yuchun Pan
  • Qishan Wang
  • Zhe Zhang

DOI
https://doi.org/10.1186/s12859-023-05280-6
Journal volume & issue
Vol. 24, no. 1
pp. 1 – 12

Abstract

Background: Constructing kinship matrices among individuals is an important step in both association studies and prediction studies based on different levels of omic data. Methods for constructing kinship matrices are becoming increasingly diverse, and each method suits particular scenarios. However, software that can comprehensively calculate kinship matrices across this variety of scenarios is still urgently needed.

Results: In this study, we developed an efficient and user-friendly Python module, PyAGH, that can perform (1) construction of conventional additive kinship matrices based on pedigree, genotypes, or abundance data from the transcriptome or microbiome; (2) construction of genomic kinship matrices in a combined population; (3) construction of kinship matrices for dominance and epistatic effects; (4) pedigree selection, tracing, detection and visualization; (5) visualization of clustering, heatmap and PCA analyses based on kinship matrices. The output of PyAGH can be easily integrated into other mainstream software according to users' purposes. Compared with other software, PyAGH integrates multiple methods for calculating kinship matrices and has advantages in speed and in the size of the data it can handle. PyAGH is developed in Python and C++ and can be easily installed with pip. Installation instructions and a manual are freely available from https://github.com/zhaow-01/PyAGH.

Conclusion: PyAGH is a fast and user-friendly Python package for calculating kinship matrices from pedigree, genotype, microbiome and transcriptome data, as well as for processing, analyzing and visualizing data and results. The package makes it easier to carry out prediction and association studies based on different levels of omic data.
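For readers unfamiliar with what a genotype-based additive kinship matrix looks like in practice, the following is a minimal NumPy sketch. It assumes the widely used VanRaden (method 1) formulation, G = ZZ'/(2 Σ p(1 − p)), with genotypes coded 0/1/2. The abstract does not state which formulation PyAGH uses, and the function name `vanraden_kinship` is hypothetical; this is not PyAGH's API.

```python
import numpy as np


def vanraden_kinship(genotypes):
    """Additive genomic kinship matrix, VanRaden method 1 (an assumption here).

    genotypes: array of shape (n_individuals, n_markers), coded as the
    count (0, 1 or 2) of one allele at each marker, with no missing values.
    """
    m = np.asarray(genotypes, dtype=float)
    # Allele frequency of the counted allele at each marker.
    p = m.mean(axis=0) / 2.0
    # Center each marker by its expected genotype value 2p.
    z = m - 2.0 * p
    # Scale by the summed expected marker variance under Hardy-Weinberg.
    denom = 2.0 * np.sum(p * (1.0 - p))
    if denom == 0.0:
        raise ValueError("all markers are monomorphic; kinship is undefined")
    return z @ z.T / denom


# Toy example: 3 individuals, 3 markers (illustrative values only).
toy = np.array([[0, 1, 2],
                [2, 1, 0],
                [1, 1, 1]])
G = vanraden_kinship(toy)
```

The result is a symmetric n × n matrix whose diagonal reflects each individual's own relationship coefficient and whose off-diagonal entries reflect pairwise genomic similarity relative to the sample's allele frequencies. Real datasets would also need handling for missing genotypes and marker quality control, which this sketch omits.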

Keywords