Machine-learning-based predictive classifier for bone marrow failure syndrome using complete blood count data
Jeongmin Seo,
Chansub Lee,
Youngil Koh,
Choong Hyun Sun,
Jong-Mi Lee,
Hong Yul An,
Myungshin Kim
Affiliations
Jeongmin Seo
Department of Internal Medicine, Seoul National University Hospital, Seoul, Republic of Korea; Department of Internal Medicine, Seoul National University Bundang Hospital, Seongnam-si, Gyeonggi-do, Republic of Korea
Chansub Lee
NOBO Medicine Inc., Seoul, Republic of Korea
Youngil Koh
Department of Internal Medicine, Seoul National University Hospital, Seoul, Republic of Korea; NOBO Medicine Inc., Seoul, Republic of Korea; Center for Precision Medicine, Seoul National University Hospital, Seoul, Republic of Korea
Choong Hyun Sun
NOBO Medicine Inc., Seoul, Republic of Korea
Jong-Mi Lee
Department of Laboratory Medicine, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea; Catholic Genetic Laboratory Centre, Seoul St. Mary’s Hospital, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea
Hong Yul An
NOBO Medicine Inc., Seoul, Republic of Korea; Corresponding author
Myungshin Kim
Department of Laboratory Medicine, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea; Catholic Genetic Laboratory Centre, Seoul St. Mary’s Hospital, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea; Corresponding author
Summary: Accurate risk assessment of bone marrow failure syndrome (BMFS) is crucial for early diagnosis and intervention, yet interpreting complete blood count (CBC) data is challenging without hematological expertise. To support primary care physicians, we developed a predictive model using basic demographics and CBC data collected retrospectively from two major hospitals in South Korea. Binary classifiers for aplastic anemia and myelodysplastic syndrome were built and combined to form a BMFS classifier. The model showed high performance in distinguishing BMFS, with consistent results across different CBC feature sets, and these results were confirmed by external validation. The algorithm gives primary care physicians a practical guide for identifying BMFS from initial CBC data, aiding effective triage, timely referral, and improved patient care.
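The summary does not state how the two binary classifiers are combined, so the following is only a minimal sketch of one plausible scheme. It assumes each sub-classifier outputs a probability (one for aplastic anemia, one for myelodysplastic syndrome) and that the BMFS score is the larger of the two. The function name `combine_bmfs`, the max rule, and the 0.5 threshold are all hypothetical choices made for illustration, not the authors' method.

```python
from dataclasses import dataclass


@dataclass
class BmfsResult:
    score: float         # combined BMFS score in [0, 1]
    flagged: bool        # True if the score reaches the referral threshold
    likely_subtype: str  # "AA" or "MDS", whichever sub-classifier scored higher


def combine_bmfs(p_aa: float, p_mds: float, threshold: float = 0.5) -> BmfsResult:
    """Combine two binary-classifier probabilities into one BMFS decision.

    ASSUMPTION: the combination rule (maximum of the two probabilities) and
    the default threshold are illustrative only; the source does not specify
    either.
    """
    for name, p in (("p_aa", p_aa), ("p_mds", p_mds)):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name} must be a probability in [0, 1], got {p}")

    # Flag BMFS if either disease-specific classifier is sufficiently confident.
    score = max(p_aa, p_mds)
    subtype = "AA" if p_aa >= p_mds else "MDS"
    return BmfsResult(score=score, flagged=score >= threshold, likely_subtype=subtype)


if __name__ == "__main__":
    # Hypothetical probabilities from the two sub-classifiers for one patient.
    print(combine_bmfs(p_aa=0.2, p_mds=0.7))
```

A max rule favors sensitivity, which suits a triage setting where a missed referral costs more than an unnecessary one. Other combinations, such as a second-stage model trained on both probabilities, would fit the same interface.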