GigaByte (Jun 2022)

Long-read HiFi sequencing correctly assembles repetitive heavy fibroin silk genes in new moth and caddisfly genomes

  • Akito Y. Kawahara
  • Caroline G. Storer
  • Amanda Markee
  • Jacqueline Heckenhauer
  • Ashlyn Powell
  • David Plotkin
  • Scott Hotaling
  • Timothy P. Cleland
  • Rebecca B. Dikow
  • Torsten Dikow
  • Ryoichi B. Kuranishi
  • Rebeccah Messcher
  • Steffen U. Pauls
  • Russell J. Stewart
  • Koji Tojo
  • Paul B. Frandsen

DOI
https://doi.org/10.46471/gigabyte.64

Abstract

Insect silk is a versatile biomaterial. Lepidoptera and Trichoptera display some of the most diverse uses of silk, with varying strength, adhesive qualities, and elastic properties. Silk fibroin genes are long (>20 Kbp), with many repetitive motifs that make them challenging to sequence. Most research thus far has focused on conserved N- and C-terminal regions of fibroin genes because a full comparison of repetitive regions across taxa has not been possible. Using the PacBio Sequel II system and SMRT sequencing, we generated high-fidelity (HiFi) long-read genomic and transcriptomic sequences for the Indianmeal moth (Plodia interpunctella) and genomic sequences for the caddisfly Eubasilissa regina. Both genomes were highly contiguous (N50 = 9.7 Mbp/32.4 Mbp, L50 = 13/11) and complete (BUSCO complete = 99.3%/95.2%), with complete and contiguous recovery of silk heavy fibroin gene sequences. We show that HiFi long-read sequencing is helpful for understanding genes with long, repetitive regions.
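
The contiguity statistics quoted above (N50 and L50) follow a standard definition: with contigs sorted from longest to shortest, N50 is the length of the contig at which the running total first reaches half of the assembly size, and L50 is the number of contigs needed to get there. The Python sketch below illustrates that definition only. The function name `n50_l50` and the toy contig lengths are hypothetical and are not taken from the article, and this is not necessarily the tool the authors used to compute their values.

```python
def n50_l50(contig_lengths):
    """Return (N50, L50) for a collection of contig lengths.

    Hypothetical helper for illustration; not the authors' pipeline.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")

    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= half_total:
            return length, count


# Toy assembly (made-up lengths, arbitrary units): total = 300, half = 150.
n50, l50 = n50_l50([80, 70, 50, 40, 30, 20, 10])
print(n50, l50)
```

A larger N50 together with a smaller L50 means that fewer, longer contigs cover half of the assembly, which is the sense in which the abstract describes both genomes as highly contiguous.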