Big Data and Cognitive Computing (Jul 2019)
Breast Cancer Diagnosis System Based on Semantic Analysis and Choquet Integral Feature Selection for High Risk Subjects
Abstract
In this work, we build a computer-aided diagnosis (CAD) system for breast cancer in high-risk patients that follows the Breast Imaging Reporting and Data System (BIRADS) and maps its main expert concepts and rules. To this end, a bag of words is built from the ontology of breast cancer analysis. For a more reliable characterization of the lesion, feature selection based on the Choquet integral is applied to discard irrelevant descriptors. Then, a set of well-known machine learning tools is used for semantic annotation, bridging the gap between low-level knowledge and the expert concepts involved in the BIRADS classification. Expert rules are thus modeled implicitly by a set of classifiers for severity diagnosis. As a result, the feature selection gives a better assessment of the lesion, and the semantic analysis context offers an attractive framework for including external factors and meta-knowledge, as well as for exploiting more than one modality. The resulting CAD system is intended for the diagnosis of breast cancer in high-risk patients. It has been validated on two complementary modalities, MRI and dual energy contrast enhancement mammography (DECEDM), and achieves a correct classification rate of 99%.
Keywords