Archives of Biological Sciences (Jan 2023)

Use of data mining to establish associations between Indian marine fish catch and environmental data

  • Gladju Joseph,
  • Kanagaraj Ayyasamy,
  • Kamalam Biju Sam

DOI
https://doi.org/10.2298/ABS230909037G
Journal volume & issue
Vol. 75, no. 4
pp. 459 – 474

Abstract

For decades, changes in fish-catch composition and the marine environment have been monitored worldwide and recorded in databases such as FAO FishStatJ and the European Union Copernicus Marine Service. However, the complexity and high variability of these datasets make it challenging to extract meaningful information with conventional data analysis methods. Therefore, in this pilot data mining study, we employed association rule mining algorithms (Apriori, ECLAT, and FP-Growth) to find frequently occurring itemsets in the fish-catch composition and marine environment data of the west and east coasts of India during the past decade (2011-2020). First, the inherent spatial and temporal variations in fish-catch composition and marine environment (sea surface temperature and chlorophyll) on the west and east coasts of India were statistically analyzed and described. Then, the data were preprocessed, selected, and transformed into categorical attributes. By applying the association rule mining algorithms, implemented in Python in the Google Colab workspace, we obtained frequent itemsets of fish catch and marine environment at different levels of minimum support and confidence. The preliminary results showed linear and inverse associations between changes in the sea surface temperature, chlorophyll concentration, and major catch groups, such as anchovies, Indian oil sardine, Indian mackerel, hairtails, butterfish-pomfrets, Bombay duck, flatfish, tunas, giant tiger prawn, crabs, lobsters, and cephalopods. Among the tested data mining algorithms, FP-Growth was found to be the most efficient and reliable in finding associations between the spatiotemporal dynamics of the marine environment and fish distribution and abundance. It could therefore potentially be used, after refinement, to support marine fisheries resource assessment and management strategies.
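
To make the terms "frequent itemset", "minimum support", and "confidence" concrete, the following is a minimal Apriori-style sketch in plain Python. It is not the authors' code: the transactions, the item labels (such as `SST=high` or `sardine=low`), the function names, and the thresholds are all hypothetical and chosen only for illustration. Each transaction stands for one record after the environmental and catch variables have been converted into categorical attributes.

```python
from itertools import combinations

# Hypothetical toy records: one set of categorical items per region-period.
# These labels and values are invented; they are not data from the study.
transactions = [
    frozenset({"SST=high", "CHL=low", "sardine=low"}),
    frozenset({"SST=high", "CHL=low", "sardine=low"}),
    frozenset({"SST=low", "CHL=high", "sardine=high"}),
    frozenset({"SST=low", "CHL=high", "sardine=high"}),
    frozenset({"SST=high", "CHL=high", "sardine=high"}),
]


def support(itemset, records):
    """Fraction of records that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= record for record in records) / len(records)


def frequent_itemsets(records, min_support):
    """Level-wise (Apriori-style) search for itemsets with support >= min_support."""
    items = sorted(set().union(*records))
    current = [frozenset([item]) for item in items]
    frequent = {}
    while current:
        # Keep the candidates at this level that meet the support threshold.
        level = {}
        for candidate in current:
            s = support(candidate, records)
            if s >= min_support:
                level[candidate] = s
        frequent.update(level)
        # Join step: merge pairs of frequent k-itemsets into (k+1)-itemsets.
        joined = {a | b for a, b in combinations(level, 2) if len(a | b) == len(a) + 1}
        # Prune step: every k-subset of a candidate must itself be frequent.
        current = [
            c for c in joined
            if all(frozenset(sub) in level for sub in combinations(c, len(c) - 1))
        ]
    return frequent


def association_rules(frequent, min_confidence):
    """Derive rules antecedent -> consequent from the frequent itemsets."""
    rules = []
    for itemset, s in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                antecedent = frozenset(antecedent)
                # confidence = support(itemset) / support(antecedent)
                confidence = s / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, s, confidence))
    return rules


itemsets = frequent_itemsets(transactions, min_support=0.4)
rules = association_rules(itemsets, min_confidence=0.9)
for antecedent, consequent, s, confidence in rules:
    print(sorted(antecedent), "->", sorted(consequent), round(s, 2), round(confidence, 2))
```

Raising `min_support` or `min_confidence` shrinks the set of reported rules. ECLAT and FP-Growth target the same frequent itemsets but organize the search differently (vertical transaction-ID lists and a prefix tree, respectively), which is where differences in efficiency arise.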

Keywords