Acta Informatica Pragensia (Aug 2024)
Advancements in Breast Cancer Diagnosis: A Comprehensive Review of Mammography Datasets, Preprocessing and Classification Techniques
Abstract
Breast cancer, a pervasive global health concern, necessitates early detection for an improved prognosis. Mammography, a pivotal screening tool, faces challenges in interpretation, motivating the integration of advanced computational models. This paper offers a comprehensive examination of breast cancer classification through mammography, focusing on machine learning (ML) and deep learning (DL) approaches. The discussion covers widely used mammography datasets, preprocessing techniques, data augmentation and diverse classification algorithms. Noteworthy datasets include LAMIS-DMDB, EMBED and INbreast. Preprocessing involves denoising and contrast enhancement, employing techniques such as Wiener filtering and histogram equalization. Data augmentation, a critical factor in handling small datasets, is explored using basic and advanced techniques, including generative adversarial networks. ML algorithms analyse entire mammograms, while DL techniques, notably convolutional neural networks, focus on localized regions of interest. Despite promising strides, several challenges persist: obtaining high-quality datasets, ensuring model interpretability, distinguishing cancerous regions from non-cancerous regions that closely resemble them, and avoiding the extraction of irrelevant features. The paper concludes by outlining potential research directions to further transform breast cancer prognosis and treatment.
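To make two of the steps named above concrete, the following is a minimal sketch of histogram equalization (contrast enhancement) and a horizontal flip (a basic augmentation). It is an illustration only, not the pipeline of any reviewed study: the function names `equalize_histogram` and `flip_horizontal` are hypothetical, and the image is assumed to be a grayscale list of rows holding integer intensities in the range 0 to `levels - 1`.

```python
def equalize_histogram(image, levels=256):
    """Spread the intensities of a grayscale image over the full range.

    Assumption: `image` is a list of rows of ints in [0, levels - 1].
    """
    pixels = [p for row in image for p in row]
    if not pixels:
        return []

    # Count how often each intensity occurs.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of intensities.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    total = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Constant image: there is no contrast to stretch.
        return [row[:] for row in image]

    # Map each intensity through the normalized cumulative distribution.
    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[p] for p in row] for row in image]


def flip_horizontal(image):
    """Basic augmentation: mirror each row left to right."""
    return [row[::-1] for row in image]


if __name__ == "__main__":
    low_contrast = [[100, 100], [101, 102]]
    print(equalize_histogram(low_contrast))
    print(flip_horizontal(low_contrast))
```

In practice, a library routine (for example from scikit-image or OpenCV) would normally be used on full-resolution mammograms; the pure-Python version here only shows the mapping through the cumulative histogram.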
Keywords