PeerJ (Jul 2023)
Where are the Penaeids crustins?
Abstract
Crustins are antimicrobial peptides belonging to the superfamily of four-disulfide core (4-DSC) domain-containing proteins. To date, crustins have been reported only in crustaceans, and their structural signature consists of a single 4-DSC domain and one cysteine-rich region. High-throughput sequencing technologies have produced highly valuable genomic information, but the volume of that information sometimes obscures what was already known about previously sequenced molecules. This study aimed (1) to corroborate that valuable descriptive information for crustin identification is lost when high-throughput sequencing projects rely on automatic annotation and (2) to detect possible crustin sequences reported in Penaeids in order to compile a list based on structural similarities, allowing phylogenetic relationships to be established from molecular characteristics. All crustin sequences reported in Penaeids and registered in the databases were retrieved. A first list was compiled from the proteins reported as crustin or carcinin, excluding those that did not meet the structural criteria. Subsequently, local alignments were used to search for highly similar sequences that are likely crustins even though they had been reported under a name other than crustin. This broader list, which includes proteins with high structural similarity, can help establish the phylogenetic relationships of shrimp genes and the evolutionary trajectory of this antimicrobial peptide family, which is distributed exclusively among crustaceans. The results revealed that most sequences obtained by Sanger sequencing or transcriptomics that met the structural criteria were correctly identified as crustins. In contrast, crustin sequences obtained by whole-genome sequencing projects were incorrectly classified or left uncharacterized, remaining temporarily “buried” in the information generated.
In addition, the sequences that met the crustin criteria tended to cluster according to the geographical region of the species; for example, crustins from shrimp inhabiting the American coasts differ from those of species native to the Asian coasts. Finally, the results suggest that annotation should take earlier, correct information into account, even when that information was generated with older technologies.
Keywords