Frontiers in Cardiovascular Medicine (Apr 2024)

CHD-CXR: a de-identified publicly available dataset of chest x-ray for congenital heart disease

  • Li Zhixin,
  • Luo Gang,
  • Ji Zhixian,
  • Wang Sibao,
  • Pan Silin

DOI
https://doi.org/10.3389/fcvm.2024.1351965
Journal volume & issue
Vol. 11

Abstract

Congenital heart disease is a prevalent birth defect, accounting for approximately one-third of major birth defects. Early detection remains a challenge, especially in medically underserved regions, where a shortage of specialized physicians often leads to missed diagnoses. Standardized chest x-rays can assist in diagnosis and treatment, but their usefulness is limited because the cardiac manifestations are subtle. Deep learning in computer vision has made it possible to detect subtle changes in chest x-rays, such as lung vessel density, and thus to detect congenital heart disease in children, which warrants further investigation. Progress in medical image artificial intelligence, however, is hindered by the lack of expert-annotated, high-quality medical image datasets. In response, we have released a dataset containing 828 DICOM chest x-ray files from children with diagnosed congenital heart disease, together with the corresponding cardiac ultrasound reports. The dataset emphasizes complex structural characteristics, facilitating the transition from machine learning to machine teaching in deep learning. To assess the dataset's applicability, we trained a preliminary model, which achieved an area under the receiver operating characteristic curve (AUC) of 0.85. A detailed introduction and the publicly available dataset are provided at: https://www.kaggle.com/competitions/congenital-heart-disease.
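
For readers unfamiliar with the metric, the area under the receiver operating characteristic curve reported above can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. The following is a minimal sketch of that calculation in Python. The function name `roc_auc` and the toy labels and scores are illustrative assumptions; they are not taken from the dataset or from the authors' model.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC as the fraction of (positive, negative) pairs
    in which the positive example has the higher score (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie is counted as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for six images (1 = disease present, 0 = absent).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic in the number of examples, which is acceptable for a dataset of this size; a rank-based implementation would be preferable for much larger evaluation sets.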

Keywords