BMC Bioinformatics (Feb 2022)

gcaPDA: a haplotype-resolved diploid assembler

  • Min Xie,
  • Linfeng Yang,
  • Chenglin Jiang,
  • Shenshen Wu,
  • Cheng Luo,
  • Xin Yang,
  • Lijuan He,
  • Shixuan Chen,
  • Tianquan Deng,
  • Mingzhi Ye,
  • Jianbing Yan,
  • Ning Yang

DOI
https://doi.org/10.1186/s12859-022-04591-4
Journal volume & issue
Vol. 23, no. 1
pp. 1 – 15

Abstract

Background: Generating chromosome-scale, haplotype-resolved assemblies is important for functional studies. However, current de novo assemblers are either haploid assemblers that discard allelic information, or diploid assemblers that can only tackle genomes of low complexity.

Results: Here, using robust programs, we build a diploid genome assembly pipeline called gcaPDA (gamete cells assisted Phased Diploid Assembler), which exploits haploid gamete cells to assist in resolving haplotypes. We demonstrate the effectiveness of gcaPDA on simulated HiFi reads from the maize genome, which is highly heterozygous and repetitive, and on real data from rice.

Conclusions: Because it can cope with complex genomes and has fewer restrictions on application than most diploid assemblers, gcaPDA is likely to find broad application in studies of eukaryotic genomes.

Keywords