PeerJ Computer Science (Sep 2021)
On the classification of simple and complex biological images using Krawtchouk moments and Generalized pseudo-Zernike moments: a case study with fly wing images and breast cancer mammograms
Abstract
In image analysis, orthogonal moments are useful mathematical transformations for creating new features from digital images. Moreover, orthogonal moment invariants produce image features that are resistant to translation, rotation, and scaling operations. Here, we show the result of a case study in biological image analysis to help researchers judge the potential efficacy of image features derived from orthogonal moments in a machine learning context. In taxonomic classification of forensically important flies from the Sarcophagidae and the Calliphoridae family (n = 74), we found the GUIDE random forests model was able to completely classify samples from 15 different species correctly based on Krawtchouk moment invariant features generated from fly wing images, with zero out-of-bag error probability. For the more challenging problem of classifying breast masses based solely on digital mammograms from the CBIS-DDSM database (n = 1,151), we found that image features generated from the Generalized pseudo-Zernike moments and the Krawtchouk moments only enabled the GUIDE kernel model to achieve modest classification performance. However, using the predicted probability of malignancy from GUIDE as a feature together with five expert features resulted in a reasonably good model that has mean sensitivity of 85%, mean specificity of 61%, and mean accuracy of 70%. We conclude that orthogonal moments have high potential as informative image features in taxonomic classification problems where the patterns of biological variations are not overly complex. For more complicated and heterogeneous patterns of biological variations such as those present in medical images, relying on orthogonal moments alone to reach strong classification performance is unrealistic, but integrating prediction result using them with carefully selected expert features may still produce reasonably good prediction models.
Keywords