Data in Brief (Jun 2024)

First transcriptome sequencing, assembly, and annotation dataset for the freshwater angelfish, Pterophyllum scalare

  • Indeever Madireddy

Journal volume & issue
Vol. 54
p. 110400

Abstract

Cichlids are relevant to biological research because their craniofacial variations are analogous to human craniofacial structure and its associated congenital anomalies. However, genetic information is available for only a limited number of cichlids. Investigating cichlids and adding to the body of knowledge about them may provide better insight into developmental biology and craniofacial structure. The angelfish, Pterophyllum scalare, is one cichlid for which genetic information, including a draft transcriptome assembly, is lacking.

This work is the first to provide a draft transcriptome and annotation for P. scalare using long-read Nanopore sequencing. Total RNA was extracted from angelfish tissue, and a cDNA-PCR library was prepared. Sequencing was performed on a single R9.4.1 MinION flow cell for 84 h. Various bioinformatic tools were then employed to assemble the sequencing reads into a transcriptome, which was then annotated against various databases.

A total of 23 million sequencing reads were collected, totalling 21.9 Gb. The N50 sequencing read length was 1255 bp and the mean read length was 938 bp. The data had an initial mean Phred score of 10.04. After assembly, the final transcriptome consists of 98,125 transcripts with a mean length of 1552 bp and an N50 length of 2277 bp. The transcriptome has a completeness of 80.5% as assessed by BUSCO. Functional annotation revealed that pathways related to signal transduction, carbohydrate metabolism, and transcription are the most annotated in the transcriptome.
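
For readers unfamiliar with the N50 statistic reported above, the following is a minimal sketch of how N50 and mean length can be computed from a list of sequence lengths. It is an illustration only and not the authors' pipeline: the function names `n50` and `mean_length` and the toy lengths are assumptions made here, and the abstract does not name the tools used to compute these figures.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of all bases in the collection.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def mean_length(lengths):
    """Return the arithmetic mean of the sequence lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    return sum(lengths) / len(lengths)


if __name__ == "__main__":
    # Hypothetical toy lengths in bp, not taken from the dataset.
    toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
    print("N50:", n50(toy_lengths))
    print("Mean length:", mean_length(toy_lengths))
```

Because N50 weights each sequence by its length, it is typically larger than the mean when the length distribution is skewed toward short sequences, which is consistent with the read and transcript statistics reported in the abstract.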

Keywords