Scientific Reports (Jul 2024)

Colorectal cancer prognosis based on dietary pattern using synthetic minority oversampling technique with K-nearest neighbors approach

  • S. Thanga Prasath,
  • C. Navaneethan

DOI
https://doi.org/10.1038/s41598-024-67848-3
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 20

Abstract

A person’s life span depends in part on diet, because food consumption can contribute to deadly diseases such as colorectal cancer (CRC). In 2020, colorectal cancer accounted for one million fatalities globally, representing 10% of all cancer deaths. This analysis included 76,679 men and 78,213 women over the age of 59 from ten states in the United States. During follow-up, 1378 men and 981 women were diagnosed with colon cancer. This prospective cohort study used 231 food items and their variants as input features to identify CRC patients. Before labelling any food as a cause of colorectal cancer, it is ethically necessary to analyse how many grams of it are consumed daily and how many times a week it is eaten. This research examines five classification algorithms on real-time datasets: K-Nearest Neighbour (KNN), Decision Tree (DT), Random Forest (RF), Logistic Regression with Classifier Chain (LRCC), and Logistic Regression with Label Powerset (LRLC). The synthetic minority oversampling technique (SMOTE) is then applied to identify and deal with imbalances in the data. Our study shows that eating more than 10 g/d of low-fat butter in bread (RR 1.99, CI 0.91–4.39), or eating it more than twice a week (RR 1.49, CI 0.93–2.38), increases CRC risk. For beef, eating more than 74 g of beef steak daily (RR 0.88, CI 0.50–1.55) or having it more than once a week (RR 0.88, CI 0.62–1.23) decreases the risk of CRC. People who include beef and dairy products in their daily diet should nonetheless be cautious about quantity; consuming these items in moderation on a regular basis protects against CRC risk. Meanwhile, a high intake of poultry (RR 0.2, CI 0.05–0.81), fish (RR 0.82, CI 0.31–2.16), and pork (RR 0.67, CI 0.17–2.65) is negatively correlated with CRC risk.
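
The abstract reports each finding as a relative risk (RR) with a confidence interval (CI). As a reading aid, the sketch below shows one common way such a pair can be computed from a two-by-two table, using the log-scale (Katz) approximation. It is only an illustration: the abstract does not state the confidence level or the estimation method, so the 95% level (z = 1.96), the function name `relative_risk`, and the counts in the usage line are all assumptions, not values from the study.

```python
import math


def relative_risk(exposed_cases, exposed_total,
                  unexposed_cases, unexposed_total, z=1.96):
    """Return (rr, lower, upper) for a 2x2 table.

    Assumption: log-scale (Katz) interval with z = 1.96, i.e. a 95% CI.
    The abstract does not say which method or level the authors used.
    """
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    rr = risk_exposed / risk_unexposed

    # Standard error of log(RR) under the Katz approximation.
    se_log_rr = math.sqrt(
        1 / exposed_cases - 1 / exposed_total
        + 1 / unexposed_cases - 1 / unexposed_total
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts, chosen only for illustration (not study data):
# 30 cases among 1000 high-intake participants, 20 among 1000 low-intake.
rr, lower, upper = relative_risk(30, 1000, 20, 1000)
print(f"RR {rr:.2f}, CI {lower:.2f}-{upper:.2f}")
```

With these hypothetical counts the point estimate is above 1 while the interval extends below 1. An interval that includes 1 is also compatible with no difference in risk between the two groups.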

Keywords