Classifying rock types by geostatistics and random forests in tandem

Parag Jyoti Dutta; Xavier Emery

doi:10.1088/2632-2153/ad3c0f

Machine Learning: Science and Technology (Jan 2024)

Affiliations

Parag Jyoti Dutta: Department of Geology, Cotton University, Guwahati 781001, India
Xavier Emery: Department of Mining Engineering & Advanced Mining Technology Center, Universidad de Chile, Santiago 8370448, Chile

Journal volume & issue: Vol. 5, no. 2, p. 025013

Abstract

Rock type classification is crucial for evaluating mineral resources in ore deposits and for rock mechanics. Mineral deposits form in a variety of rock bodies and rock types, but identifying the rock type in drill core samples is often complicated by overprinting and weathering processes. One approach to classifying rock types from drill core data relies on whole-rock geochemical assays as features. Few studies, however, address rock type classification using only a limited number of metal grades and dry bulk density as features. The novelty of our approach is the introduction of two sets of feature variables (proxies) at the sampled data points: one generated by geostatistical leave-one-out cross-validation, the other by kriging to remove the short-scale spatial variation of the measured features. We applied our proposal to a dataset from a porphyry Cu–Au deposit in Mongolia. The model performance on a testing data subset indicates that, when the training dataset is not large, the performance of the classifier (a random forest) improves substantially when the proxy features are incorporated as a complement to the original measured features. At each training data point, these proxy features carry information on the underlying spatial correlation structure of the data, the scales of variation, the sampling design, and the feature values observed at neighboring points, which shows the benefit of combining geostatistics with machine learning.
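
The leave-one-out proxy idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: inverse-distance weighting is swapped in for kriging so that the example needs only the Python standard library, and the function name `loo_idw_proxy`, its parameters `power` and `k`, and the toy coordinates and grades are all hypothetical. At each sample, the measured value is left out and re-estimated from neighboring samples, and that estimate becomes an extra feature alongside the measured one.

```python
import math

def loo_idw_proxy(coords, values, power=2.0, k=3):
    """Leave-one-out proxy: re-estimate each sample from its k nearest
    other samples by inverse-distance weighting (a stand-in for kriging)."""
    proxies = []
    for i, p in enumerate(coords):
        # Distances to every other sample; sample i itself is left out.
        neighbours = sorted(
            (math.dist(p, q), values[j])
            for j, q in enumerate(coords) if j != i
        )[:k]
        # Co-located neighbour: use its value directly (avoids division by zero).
        if neighbours[0][0] == 0.0:
            proxies.append(neighbours[0][1])
            continue
        weights = [1.0 / d ** power for d, _ in neighbours]
        total = sum(weights)
        proxies.append(sum(w * v for w, (_, v) in zip(weights, neighbours)) / total)
    return proxies

# Hypothetical toy data: sample coordinates and one measured feature (e.g. a Cu grade).
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
          (1.0, 1.0, 0.0), (5.0, 5.0, 0.0)]
cu = [0.5, 0.6, 0.4, 0.5, 2.0]

cu_proxy = loo_idw_proxy(coords, cu)

# Feature matrix: measured value complemented by its spatial proxy.
features = [[m, p] for m, p in zip(cu, cu_proxy)]
```

Each row of `features` could then be passed to a classifier such as a random forest. In a kriging-based version, the weights would come from a fitted variogram model rather than from distance alone.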

Published in Machine Learning: Science and Technology

ISSN: 2632-2153 (Online)
Publisher: IOP Publishing
Country of publisher: United Kingdom
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics: Computer engineering. Computer hardware; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://iopscience.iop.org/journal/2632-2153
