PeerJ (Sep 2024)

Identifying transgene insertions in Caenorhabditis elegans genomes with Oxford Nanopore sequencing

  • Paula E. Adams,
  • Jennifer L. Thies,
  • John M. Sutton,
  • Joshua D. Millwood,
  • Guy A. Caldwell,
  • Kim A. Caldwell,
  • Janna L. Fierst

DOI
https://doi.org/10.7717/peerj.18100
Journal volume & issue
Vol. 12
p. e18100

Abstract

Genetically modified organisms are commonly used in disease research and agriculture, but the precise genomic alterations underlying transgenic mutations are often unknown. The position and characteristics of transgenes, including the number of independent insertions, influence the expression of both transgenic and wild-type sequences. We used long-read Oxford Nanopore Technologies (ONT) sequencing to sequence and assemble the genomes of two transgenic strains of Caenorhabditis elegans commonly used in research on neurodegenerative diseases: BY250 (pPdat-1::GFP) and UA44 (GFP and human α-synuclein), a model for Parkinson’s research. After scaffolding to the reference, the final assembled sequences were ∼102 Mb with N50s of 17.9 Mb and 18.0 Mb, respectively, and L90s of six contiguous sequences, representing chromosome-level assemblies. Each of the assembled sequences contained more than 99.2% of the Nematoda BUSCO genes found in the C. elegans reference and 99.5% of the annotated C. elegans reference protein-coding genes. We identified the locations of the transgene insertions and confirmed that all transgene sequences were inserted in intergenic regions, leaving the organismal gene content intact. The transgenic C. elegans genomes presented here will be a valuable resource for research on Parkinson’s and other neurodegenerative diseases. Our work demonstrates that long-read sequencing is a fast, cost-effective way to assemble genome sequences and characterize mutant lines and strains.
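
For readers unfamiliar with the contiguity statistics quoted above, the following is a minimal sketch of how N50 and L90 are conventionally computed from a list of sequence lengths. It is not the authors' pipeline: the function name `nx_lx` and the toy lengths are hypothetical and chosen only for illustration. Nx is the length of the shortest sequence in the smallest set of longest sequences that together cover at least x% of the assembly, and Lx is the number of sequences in that set.

```python
def nx_lx(lengths, fraction):
    """Return (Nx, Lx) for sequence lengths and a fraction in (0, 1].

    Hypothetical helper for illustration; not taken from the paper.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    # Walk through sequences from longest to shortest, accumulating length
    # until the requested share of the total assembly size is covered.
    ordered = sorted(lengths, reverse=True)
    threshold = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return length, count


# Toy lengths in Mb (hypothetical, NOT the assembly reported here):
# six chromosome-sized sequences plus two small fragments, 100 Mb total.
toy_lengths = [21, 18, 17, 15, 14, 13, 1, 1]

n50, l50 = nx_lx(toy_lengths, 0.5)
n90, l90 = nx_lx(toy_lengths, 0.9)
print(f"N50 = {n50} Mb (L50 = {l50}); N90 = {n90} Mb (L90 = {l90})")
```

In this toy case the six largest sequences are needed to reach 90% of the total length, so L90 is six, which is the pattern expected when nearly all of an assembly sits in six chromosome-scale sequences.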

Keywords