Temporal dependency modeling for improved medical image segmentation: The R-UNet perspective

Ahmed Alweshah; Roohollah Barzamini; Farshid Hajati; Shoorangiz Shams Shamsabad Farahani; Mohammad Arabian; Behnaz Sohani

Franklin Open (Dec 2024)

Affiliations

Ahmed Alweshah: Department of Electrical Engineering, Islamic Azad University Science and Research Branch, Tehran, Iran
Roohollah Barzamini: Department of Electrical Engineering, Central Tehran Branch, Islamic Azad University, Tehran 1148963537, Iran
Farshid Hajati: School of Science and Technology, Faculty of Science, Agriculture, Business and Law, University of New England, Armidale, NSW 2350, Australia
Shoorangiz Shams Shamsabad Farahani: Department of Electrical Engineering, Islamshahr branch, Islamic Azad University, Islamshahr, Iran
Mohammad Arabian: Department of Electrical Engineering, Central Tehran Branch, Islamic Azad University, Tehran 1148963537, Iran
Behnaz Sohani: Wolfson School of Mechanical, Electrical & Manufacturing Engineering, Loughborough University, Loughborough, Leicestershire, LE11 3TU, UK; Corresponding author.

Journal volume & issue: Vol. 9, p. 100182

Abstract

In this study, we propose a modified version of the widely used UNet architecture that integrates recurrent blocks at each step of the encoder (down-sampling) and decoder (up-sampling) stages. The proposed Recurrent UNet (R-UNet) architecture aims to improve semantic segmentation performance by allowing the model to capture temporal dependencies and long-range contextual information. R-UNet consists of two main components: a recurrent encoder and a recurrent decoder. The recurrent encoder is composed of a series of convolutional and recurrent blocks, which extract features from the input image and propagate them across time. The recurrent decoder consists of a similar series of convolutional and recurrent blocks, which use the extracted features to generate the final segmentation mask. An attention mechanism at the bottleneck of the model enhances feature extraction. R-UNet is evaluated on multiple benchmark datasets covering liver segmentation, brain tumor detection, mitochondria segmentation, and lung imaging, as well as a proprietary lung CT COVID-19 dataset and various multi-organ imaging datasets. The experimental results show that R-UNet outperforms the standard UNet architecture and several other state-of-the-art semantic segmentation models in terms of accuracy score, achieving an overall accuracy of 97.2% on the Mitochondria dataset, 97.83% on the Liver dataset, 89.17% on the Tumor dataset, and 97.22% on the Lung dataset.
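
The abstract does not say how the recurrent blocks are built. As a rough illustration only, the sketch below shows one common form of recurrent convolutional block, in which the same convolution is reapplied to its own output for a few steps while the feed-forward response of the input is added back each time. The function names, the single-channel 2D input, the odd-sized kernels, and the number of steps are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def conv2d_same(x, kernel):
    """Zero-padded 'same' cross-correlation of a 2D array with an odd-sized kernel."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    # Accumulate one shifted, weighted copy of the input per kernel entry.
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + x.shape[0], j:j + x.shape[1]]
    return out

def recurrent_conv_block(x, w_in, w_rec, steps=2):
    """Hypothetical recurrent block (not the paper's definition):
    h_0 = relu(conv(x, w_in))
    h_t = relu(conv(x, w_in) + conv(h_{t-1}, w_rec)) for t = 1..steps
    """
    feed_forward = conv2d_same(x, w_in)   # computed once, reused at every step
    h = np.maximum(feed_forward, 0.0)     # initial state h_0
    for _ in range(steps):
        h = np.maximum(feed_forward + conv2d_same(h, w_rec), 0.0)
    return h

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((8, 8))                 # stand-in for one feature map
    w_in = rng.normal(scale=0.1, size=(3, 3))  # assumed 3x3 kernels
    w_rec = rng.normal(scale=0.1, size=(3, 3))
    features = recurrent_conv_block(image, w_in, w_rec, steps=2)
    print(features.shape)  # the spatial size is preserved
```

In a UNet-style model, a block of this kind would sit alongside the ordinary convolutions at each encoder and decoder stage. Because the recurrent kernel is shared across steps, each extra step widens the effective receptive field without adding parameters.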

Published in Franklin Open

ISSN: 2773-1871 (Print); 2773-1863 (Online)
Publisher: Elsevier
Country of publisher: United States
LCC subjects: Technology
Website: https://www.sciencedirect.com/journal/franklin-open
