PLoS ONE (Jan 2023)
Linear programming based computational technique for leukemia classification using gene expression profile.
Abstract
Cancer is a serious public health concern worldwide and a leading cause of death. Blood cancers are among the most dangerous types of cancer. Leukemia is a cancer that affects the blood cells and bone marrow. Acute leukemia progresses rapidly and is fatal if left untreated. A timely, reliable, and accurate diagnosis of leukemia at an early stage is critical to treating patients and preserving their lives. There are four main types of leukemia: acute lymphocytic leukemia, acute myelogenous leukemia, chronic lymphocytic leukemia, and chronic myelogenous leukemia. Cancerous cells are often recognized through manual analysis of microscopic images, which requires a highly skilled pathologist. Leukemia symptoms may include lethargy, a lack of energy, a pale complexion, recurrent infections, and easy bleeding or bruising. One of the challenges in this area is identifying subtypes of leukemia for specialized treatment. This study aims to increase the precision of diagnosis, to assist in the development of personalized treatment plans, and to improve general leukemia-related healthcare practices. In this research, we used leukemia gene expression data from the Curated Microarray Database (CuMiDa). Microarrays are well suited to studying cancer; however, categorizing the expression patterns in microarray data can be challenging. The proposed study uses feature selection methods and machine learning techniques to predict and classify subtypes of leukemia in the CuMiDa gene expression dataset GSE9476. This work uses linear programming (LP) as a machine learning technique for classification. The LP model classifies and predicts the leukemia subtypes Bone_Marrow_CD34, Bone Marrow, AML, PB, and PBSC CD34. Before applying the LP model, we selected 25 features from the dataset's 22283 features. These 25 features were the most discriminative for classification.
The classification accuracy of this work is 98.44%.
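
The abstract does not state which feature selection criterion or which LP formulation was used, so the following is only an illustrative sketch of the general pipeline (select a small feature subset, then train an LP-based classifier). It assumes a one-way ANOVA F-score ranking for feature selection and a 1-norm soft-margin linear program solved one-vs-rest with SciPy's `linprog`; both choices, all function names (`top_k_by_f_score`, `fit_lp_binary`, `fit_one_vs_rest`, `predict`), and the regularization weight `lam` are assumptions made for illustration. The data are synthetic, not GSE9476, so the printed accuracy says nothing about the reported result.

```python
import numpy as np
from scipy.optimize import linprog


def top_k_by_f_score(X, y, k):
    """Return indices of the k features with the largest one-way ANOVA F statistic."""
    classes = np.unique(y)
    grand_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        between += len(Xc) * (Xc.mean(axis=0) - grand_mean) ** 2
        within += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)
    f = (between / (len(classes) - 1)) / (within / (len(X) - len(classes)) + 1e-12)
    return np.argsort(f)[::-1][:k]


def fit_lp_binary(X, t, lam=0.01):
    """1-norm soft-margin LP (assumed formulation) for labels t in {+1, -1}.

    minimize   lam * ||w||_1 + sum(xi)
    subject to t_i * (w . x_i + b) >= 1 - xi_i,  xi_i >= 0
    with w split as w_pos - w_neg so the problem stays linear.
    """
    n, d = X.shape
    T = t[:, None].astype(float)
    # Variable order: [w_pos (d), w_neg (d), b (1), xi (n)]
    c = np.concatenate([lam * np.ones(2 * d), [0.0], np.ones(n)])
    A_ub = np.hstack([-T * X, T * X, -T, -np.eye(n)])
    b_ub = -np.ones(n)
    bounds = [(0, None)] * (2 * d) + [(None, None)] + [(0, None)] * n
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    w = res.x[:d] - res.x[d:2 * d]
    return w, res.x[2 * d]


def fit_one_vs_rest(X, y, lam=0.01):
    """Train one binary LP classifier per class."""
    classes = np.unique(y)
    models = [fit_lp_binary(X, np.where(y == c, 1.0, -1.0), lam) for c in classes]
    W = np.stack([w for w, _ in models])   # shape (n_classes, n_features)
    b = np.array([b for _, b in models])
    return classes, W, b


def predict(X, classes, W, b):
    """Assign each sample to the class whose linear score is largest."""
    return classes[np.argmax(X @ W.T + b, axis=1)]


# Synthetic stand-in data: 5 classes, 200 features, 2 informative features per class.
rng = np.random.default_rng(0)
n_per_class, n_classes, n_features = 12, 5, 200
y = np.repeat(np.arange(n_classes), n_per_class)
X = rng.normal(size=(y.size, n_features))
for c in range(n_classes):
    X[y == c, 2 * c:2 * c + 2] += 6.0

selected = top_k_by_f_score(X, y, k=10)
classes, W, b = fit_one_vs_rest(X[:, selected], y)
accuracy = np.mean(predict(X[:, selected], classes, W, b) == y)
print(f"training accuracy on synthetic data: {accuracy:.2f}")
```

For real expression data, feature selection and model fitting would need to be done inside a cross-validation loop so that the selected genes are never chosen using the held-out samples; the sketch scores on its own training data only to keep the example short.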