Insects (Jan 2022)

Current State of DNA Barcoding of Sciaroidea (Diptera)—Highlighting the Need to Build the Reference Library

  • Jostein Kjærandsen

DOI
https://doi.org/10.3390/insects13020147
Journal volume & issue
Vol. 13, no. 2
p. 147

Abstract


DNA barcoding has tremendous potential for advancing species knowledge for many diverse groups of insects, potentially paving the way for machine identification and semi-automated monitoring of whole insect faunas. Here, I review the current state of DNA barcoding of the superfamily Sciaroidea (Diptera), a diverse group consisting of eight understudied fly families whose described species worldwide make up some 10% (≈16,000 species) of all Diptera. World data of Sciaroidea were extracted from the Barcode of Life online database BoldSystems (BOLD) and contrasted with results and experiences from a Nordic project to build the reference library. Well over 1.2 million (1,224,877) Sciaroidea specimens have been submitted for barcoding, yielding barcode-compliant sequences that resulted in 56,648 so-called barcode index numbers (BINs, machine-generated proxies for species). Although the BINs on BOLD already represent 3.5 times the number of described species, merely some 2850 named species (described or interim names, 5% of the BINs) have currently been assigned a BIN. The other 95% remain as dark taxa, figuring in many frontier publications as statistics representing proxies for species diversity within a family. In the Nordic region, however, substantial progress has been made towards building a complete reference library, which currently accounts for 55% of all named Sciaroidea BINs on BOLD. Another major source (31%) of named Sciaroidea BINs on BOLD comes from COI sequences mined from GenBank, generated through phylogenetic and integrative studies outside of BOLD. Building a quality reference library for understudied insects such as Sciaroidea requires heavy investment by trained taxonomists, both before and after sequencing, to build and curate voucher collections, to continually improve the quality of the data, and to describe new species. Only when the BINs are properly calibrated by a rigorously quality-checked reference library can the great potential of classical taxonomic barcoding, metabarcoding, and eDNA ecology be realized.
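The proportions quoted in the abstract can be checked with a few lines of arithmetic. The sketch below uses only the figures stated above. The variable names are illustrative, and the Nordic and GenBank counts are rough estimates derived from the quoted percentages, not values reported in the abstract.

```python
# Figures quoted in the abstract (the species counts are approximate there too).
described_species = 16_000   # described Sciaroidea species worldwide (approx.)
total_bins = 56_648          # Sciaroidea BINs on BOLD
named_bins = 2_850           # BINs carrying a described or interim name (approx.)

# Ratio of BINs to described species, and the named vs. dark-taxa split.
bins_per_described = total_bins / described_species
named_share = named_bins / total_bins
dark_share = 1 - named_share

# Assumption: the 55% and 31% shares apply to the named BINs only,
# so these counts are estimates derived here for illustration.
nordic_named_est = 55 * named_bins // 100
genbank_named_est = 31 * named_bins // 100

print(f"BINs per described species: {bins_per_described:.1f}")
print(f"Named BINs: {named_share:.0%}, dark taxa: {dark_share:.0%}")
print(f"Estimated named BINs from the Nordic project: {nordic_named_est}")
print(f"Estimated named BINs mined from GenBank: {genbank_named_est}")
```

Rounded, the results agree with the abstract's "3.5 times", "5%", and "95%" figures.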

Keywords