IEEE Access (Jan 2022)

Application of Machine Learning to Environmental DNA Metabarcoding

  • Mutsumi Kimura,
  • Hiroki Yamanaka,
  • Yasuhiko Nakashima

DOI
https://doi.org/10.1109/ACCESS.2022.3207173
Journal volume & issue
Vol. 10
pp. 101790 – 101794

Abstract

Machine learning is an effective technique for classifying big data, and the coding method used to feed data into a neural network is critical. In this study, we apply machine learning to environmental deoxyribonucleic acid (DNA) metabarcoding. We propose three coding methods for nucleic acid sequences, one of which is two-bit coding. In this method, each base is represented by two bits, so that the four types of bases correspond to the four two-bit binary numbers. A three-layer perceptron is used as the neural network, and backpropagation is used as the learning rule. The accuracy for two-bit coding is slightly higher than for the other coding methods, and we judge two-bit coding to be the best of the three, mainly because it requires a relatively small amount of information to represent the nucleic acid sequences. We conclude that environmental DNA metabarcoding is an interesting novel application of machine learning.
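
The two-bit coding described in the abstract can be illustrated with a minimal sketch. The abstract does not state which base receives which two-bit number, so the assignment below (A=00, C=01, G=10, T=11) and the names `TWO_BIT_CODE` and `encode_two_bit` are assumptions made for illustration only, not the authors' implementation.

```python
# Hypothetical base-to-bits assignment; the abstract does not specify
# which base maps to which two-bit number.
TWO_BIT_CODE = {
    "A": (0, 0),
    "C": (0, 1),
    "G": (1, 0),
    "T": (1, 1),
}


def encode_two_bit(sequence):
    """Encode a nucleic acid sequence as a flat list of 0/1 values, two per base."""
    bits = []
    for base in sequence.upper():
        if base not in TWO_BIT_CODE:
            # Ambiguity codes such as N are outside this sketch.
            raise ValueError(f"unsupported base: {base!r}")
        bits.extend(TWO_BIT_CODE[base])
    return bits


if __name__ == "__main__":
    # A 4-base sequence becomes an 8-element input vector.
    print(encode_two_bit("ACGT"))
```

Under this scheme a sequence of N bases yields 2N input values, which is the sense in which the representation stays compact; a vector of this kind could then serve as the input layer of a perceptron.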

Keywords