G3: Genes, Genomes, Genetics (Jul 2020)

Improving the Chromosome-Level Genome Assembly of the Siamese Fighting Fish (Betta splendens) in a University Master’s Course

  • Stefan Prost
  • Malte Petersen
  • Martin Grethlein
  • Sarah Joy Hahn
  • Nina Kuschik-Maczollek
  • Martyna Ewa Olesiuk
  • Jan-Olaf Reschke
  • Tamara Elke Schmey
  • Caroline Zimmer
  • Deepak K. Gupta
  • Tilman Schell
  • Raphael Coimbra
  • Jordi De Raad
  • Fritjof Lammers
  • Sven Winter
  • Axel Janke

DOI
https://doi.org/10.1534/g3.120.401205
Journal volume & issue
Vol. 10, no. 7
pp. 2179 – 2183

Abstract

Ever-decreasing costs, along with advances in sequencing and library preparation technologies, now enable even small research groups to generate chromosome-level assemblies. Here we report the generation of an improved chromosome-level assembly for the Siamese fighting fish (Betta splendens), carried out during a practical university master’s course. The Siamese fighting fish is a popular aquarium fish and an emerging model species for research on aggressive behavior. We updated the current genome assembly by generating a new long-read nanopore-based assembly and then scaffolding it to chromosome level using previously published Hi-C data. The use of ∼35× nanopore-based long-read data sequenced on a MinION platform (Oxford Nanopore Technologies) allowed us to generate a baseline assembly of only 1,276 contigs with a contig N50 of 2.1 Mbp and a total length of 441 Mbp. Scaffolding using the Hi-C data resulted in 109 scaffolds with a scaffold N50 of 20.7 Mbp. More than 99% of the assembly is contained in 21 scaffolds. The assembly contains 96.1% complete BUSCO genes from the Actinopterygii dataset, indicating high assembly quality. We present an improved full chromosome-level assembly of the Siamese fighting fish generated during a university master’s course. The use of ∼35× long-read nanopore data drastically improved the baseline assembly in terms of continuity. We show that relatively inexpensive high-throughput sequencing technologies such as the long-read MinION sequencing platform can be used in educational settings, allowing students to gain practical skills in modern genomics and to generate high-quality results that benefit downstream research projects.
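
The contig and scaffold N50 values quoted above follow the common definition of N50: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The sketch below illustrates that definition only. The function name `n50` and the toy lengths are assumptions made for illustration; they are not the authors' pipeline or data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Hypothetical helper for illustration: sorts lengths from longest to
    shortest and returns the length at which the running sum first
    reaches at least half of the total length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Toy contig lengths (arbitrary units), not taken from the assembly.
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_contigs))
```

Under this definition, joining contigs into longer scaffolds raises N50 even though the total assembly length stays roughly the same, which is why the scaffold N50 is much larger than the contig N50.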

Keywords