IEEE Access (Jan 2023)
ForamViT-GAN: Exploring New Paradigms in Deep Learning for Micropaleontological Image Analysis
Abstract
In geosciences, micropaleontology studies the evolution of microfossils (e.g., foraminifera) through the geological record and uses this information to reconstruct past environmental and climatic conditions. The field depends primarily on the visual recognition of features in microfossils, which makes it well suited to computer vision, specifically deep convolutional neural networks (CNNs), for automating and optimizing microfossil identification and classification. In addition, unlabeled, low-resolution micropaleontological data are often available in large volumes compared with other geoscience datasets. While the application of deep learning in micropaleontology is growing rapidly, these efforts have been severely hampered by (i) the limited availability of high-quality, high-resolution labeled fossil images and (ii) the significant effort required for subject matter experts to label fossils manually. Furthermore, previous works primarily relied on CNNs with transfer learning to obtain high-accuracy predictions, which may reduce the explainability and reproducibility of the models. To overcome these issues, we propose a novel deep learning workflow that couples hierarchical vision transformers with style-based generative adversarial network algorithms to efficiently acquire and synthetically generate large volumes of realistic, high-resolution labeled micropaleontological datasets. Our study demonstrates that the proposed workflow can generate high-resolution images with a high signal-to-noise ratio of 39.1 dB and realistic synthetic images with a Fréchet inception distance (FID) score of 14.88. In addition, the workflow can provide a considerable volume of self-labeled datasets that can be used for model benchmarking and various downstream visual tasks, including fossil classification and segmentation.
We further performed, for the first time, few-shot semantic segmentation of different foraminifera chambers on both the generated and synthetic images with high accuracy. This novel meta-learning approach is only possible when a high-resolution, high-volume labeled dataset is available. Therefore, our proposed deep learning-based workflow shows potential to advance and optimize micropaleontological research and other visually dependent geological analyses.
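To make the 39.1 dB figure easier to interpret, the following is a minimal sketch of how a signal-to-noise ratio in decibels is commonly computed between a reference image and a reconstructed one. It assumes the reported value is a peak signal-to-noise ratio (PSNR) on 8-bit images, which the abstract does not state explicitly; the function name `psnr` and the `max_value` parameter are illustrative choices, not taken from the paper.

```python
import math


def _flatten(image):
    """Yield pixel values from a (possibly nested) list of numbers."""
    for item in image:
        if isinstance(item, (list, tuple)):
            yield from _flatten(item)
        else:
            yield float(item)


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Assumption: pixel values lie in [0, max_value]; 255 corresponds to
    8-bit images, which may differ from the setup used in the paper.
    """
    ref = list(_flatten(reference))
    est = list(_flatten(estimate))
    if not ref or len(ref) != len(est):
        raise ValueError("images must be non-empty and equally sized")

    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(ref, est)) / len(ref)
    if mse == 0:
        return math.inf  # identical images

    # PSNR = 10 * log10(MAX^2 / MSE)
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Under these assumptions, a PSNR of 39.1 dB would correspond to a mean squared error of roughly 8, that is, a root-mean-square error of about 2.8 grey levels out of 255.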
Keywords