Whole Spine Segmentation Using Object Detection and Semantic Segmentation
Raffaele Da Mutten,
Olivier Zanier,
Sven Theiler,
Seung-Jun Ryu,
Luca Regli,
Carlo Serra,
Victor E. Staartjes
Affiliations
Raffaele Da Mutten
Machine Intelligence in Clinical Neuroscience (MICN) Laboratory, Department of Neurosurgery, Clinical Neuroscience Center, University Hospital Zürich, University of Zürich, Zürich, Switzerland
Olivier Zanier
Machine Intelligence in Clinical Neuroscience (MICN) Laboratory, Department of Neurosurgery, Clinical Neuroscience Center, University Hospital Zürich, University of Zürich, Zürich, Switzerland
Sven Theiler
Machine Intelligence in Clinical Neuroscience (MICN) Laboratory, Department of Neurosurgery, Clinical Neuroscience Center, University Hospital Zürich, University of Zürich, Zürich, Switzerland
Seung-Jun Ryu
Department of Neurosurgery, Daejeon Eulji University Hospital, Eulji University Medical School, Daejeon, Korea
Luca Regli
Machine Intelligence in Clinical Neuroscience (MICN) Laboratory, Department of Neurosurgery, Clinical Neuroscience Center, University Hospital Zürich, University of Zürich, Zürich, Switzerland
Carlo Serra
Machine Intelligence in Clinical Neuroscience (MICN) Laboratory, Department of Neurosurgery, Clinical Neuroscience Center, University Hospital Zürich, University of Zürich, Zürich, Switzerland
Victor E. Staartjes
Machine Intelligence in Clinical Neuroscience (MICN) Laboratory, Department of Neurosurgery, Clinical Neuroscience Center, University Hospital Zürich, University of Zürich, Zürich, Switzerland
Objective
Virtual and augmented reality have enjoyed increased attention in spine surgery, with preoperative planning, pedicle screw placement, and surgical training among the most studied use cases. Identifying osseous structures is a key aspect of navigating a 3-dimensional virtual reconstruction. To replace the otherwise time-consuming process of labeling vertebrae on each slice individually, we propose a fully automated pipeline that segments the whole spine on computed tomography (CT) scans and can form the basis for further virtual or augmented reality applications and for radiomic analysis.
Methods
Using a large public dataset of annotated vertebral CT scans, we first trained a YOLOv8m (You-Only-Look-Once algorithm, version 8, medium size) object detection model to locate each vertebra individually. A 2D U-Net was then trained on the resulting cropped images for semantic segmentation and externally validated on 2 different public datasets.
Results
Two hundred fourteen CT scans (cervical, thoracic, or lumbar spine) were used for model training, and 40 scans were used for external validation. Vertebra recognition achieved a mAP50 (mean average precision at an intersection-over-union [Jaccard] threshold of 0.5) of over 0.84, and the segmentation algorithm attained a mean Dice score of 0.75 ± 0.14 at internal validation and of 0.77 ± 0.12 and 0.82 ± 0.14 on the 2 external validation datasets, respectively.
Conclusion
We propose a 2-stage approach consisting of single-vertebra labeling by an object detection algorithm followed by semantic segmentation. In this externally validated pilot study, we demonstrate robust performance of the object detection network in identifying individual vertebrae and of the segmentation model in precisely delineating the bony structures.
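
The 2-stage design described in the Methods (detection, cropping, per-crop segmentation) can be outlined in a short sketch. The code below is illustrative only and does not reproduce the authors' implementation: it assumes a YOLOv8 detector loaded through the ultralytics package and a pre-trained 2D U-Net available as a PyTorch module; the weight file name "vertebra_yolov8m.pt" and the variable "unet" are hypothetical placeholders. The Dice coefficient used as the segmentation metric is also shown.

# Minimal sketch of the 2-stage pipeline, under the assumptions stated above.
import numpy as np
import torch
from ultralytics import YOLO

def dice_score(pred: np.ndarray, target: np.ndarray) -> float:
    """Dice coefficient between two binary masks: 2*|A∩B| / (|A| + |B|)."""
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    return 2.0 * intersection / denom if denom > 0 else 1.0

def segment_slice(ct_slice: np.ndarray, detector: YOLO, unet: torch.nn.Module) -> np.ndarray:
    """Stage 1: detect each vertebra on a 2D CT slice; stage 2: segment each cropped box."""
    # YOLOv8 expects an 8-bit, 3-channel image, so rescale the CT slice accordingly.
    lo, hi = float(ct_slice.min()), float(ct_slice.max())
    img8 = ((ct_slice - lo) / max(hi - lo, 1e-6) * 255).astype(np.uint8)
    rgb = np.stack([img8] * 3, axis=-1)
    mask = np.zeros(ct_slice.shape, dtype=np.uint8)
    # Stage 1: one bounding box per visible vertebra.
    result = detector(rgb, verbose=False)[0]
    for x1, y1, x2, y2 in result.boxes.xyxy.cpu().numpy().astype(int):
        crop = torch.from_numpy(img8[y1:y2, x1:x2]).float()[None, None] / 255.0
        # Stage 2: semantic segmentation of the cropped vertebra with the 2D U-Net.
        with torch.no_grad():
            prob = torch.sigmoid(unet(crop))[0, 0]
        mask[y1:y2, x1:x2] |= (prob > 0.5).numpy().astype(np.uint8)
    return mask

# Example usage (placeholder file names): the per-slice masks can then be stacked into a
# volume and compared against ground-truth annotations with dice_score().
# detector = YOLO("vertebra_yolov8m.pt")
# mask = segment_slice(ct_slice, detector, unet)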