IEEE Access (Jan 2022)

A Convolutional Neural Network-Based Framework for Classification of Protein Localization Using Confocal Microscopy Images

  • Sonam Aggarwal,
  • Sheifali Gupta,
  • Ramani Kannan,
  • Rakesh Ahuja,
  • Deepali Gupta,
  • Sapna Juneja,
  • Samir Brahim Belhaouari

DOI
https://doi.org/10.1109/ACCESS.2022.3197189
Journal volume & issue
Vol. 10
pp. 83591 – 83611

Abstract

Understanding protein subcellular localization is essential in proteomics research. Developments in molecular biology and computer science have enabled the use of computational approaches to identify proteins in cells. Confocal microscopy, used by the Human Protein Atlas (HPA), is an excellent method for locating proteins. Categorizing human proteins by location can help researchers better understand human pathophysiology and help doctors automate medical image interpretation. The HPA comprises millions of images annotated with single or multiple labels. However, only a few methods for automated prediction of protein localization have been developed, and they mostly concentrate on single-label classification, so a recognition system for multi-label classification of HPA images with acceptable performance is still needed. This study therefore aims to develop a deep learning-based system for multi-label classification of HPA images. Specifically, two architectures are proposed for automatically extracting features from the images and predicting the localization of proteins in 28 subcellular compartments: first, a convolutional neural network trained from scratch, and second, an ensemble-based model built on transfer learning architectures. The results demonstrate that both models are effective in classifying proteins according to their location in the major cellular organelles. However, the proposed convolutional network outperforms the ensemble model on images with multiple simultaneous protein localizations. Three performance metrics (recall, precision, and F1-score) were used to assess the models. The proposed convolutional neural network outperforms the ensemble model, achieving a recall of 0.75, a precision of 0.75, and an F1-score of 0.74.
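
To make the reported metrics concrete, the following is a minimal sketch of how precision, recall, and F1-score can be computed for multi-label predictions. It is an illustration only, not the authors' evaluation code. The abstract does not state the averaging scheme or the decision threshold, so micro-averaging and a threshold of 0.5 are assumptions here, the function names `binarize` and `micro_prf` are hypothetical, and the toy data uses 4 labels rather than the 28 compartments.

```python
from typing import List, Tuple


def binarize(probs: List[List[float]], threshold: float = 0.5) -> List[List[int]]:
    """Turn per-label probabilities into 0/1 predictions (threshold is an assumption)."""
    return [[1 if p >= threshold else 0 for p in row] for row in probs]


def micro_prf(y_true: List[List[int]], y_pred: List[List[int]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 over all (image, label) pairs."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1  # label present and predicted
            elif t == 0 and p == 1:
                fp += 1  # label predicted but absent
            elif t == 1 and p == 0:
                fn += 1  # label present but missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data: 3 images, 4 labels; an image may carry several labels at once.
y_true = [[1, 0, 1, 0], [0, 1, 0, 0], [1, 1, 0, 0]]
probs = [[0.9, 0.2, 0.4, 0.1], [0.3, 0.8, 0.6, 0.1], [0.7, 0.6, 0.2, 0.1]]

precision, recall, f1 = micro_prf(y_true, binarize(probs))
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Micro-averaging pools every (image, label) decision before computing the ratios, so frequent labels dominate the score. Macro-averaging, which computes the metrics per label and then averages them, would weight rare compartments equally and could give different values on an imbalanced dataset.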

Keywords