Frontiers in Physiology (Aug 2022)

Self-supervised learning for macromolecular structure classification based on cryo-electron tomograms

  • Tarun Gupta,
  • Xuehai He,
  • Mostofa Rafid Uddin,
  • Xiangrui Zeng,
  • Andrew Zhou,
  • Jing Zhang,
  • Zachary Freyberg,
  • Min Xu

DOI
https://doi.org/10.3389/fphys.2022.957484
Journal volume & issue
Vol. 13

Abstract

Macromolecular structure classification from cryo-electron tomography (cryo-ET) data is important for understanding macromolecular dynamics. It has a wide range of applications and is essential for enhancing our knowledge of the sub-cellular environment. However, a major limitation has been the shortage of labeled cryo-ET data. In this work, we use Contrastive Self-supervised Learning (CSSL) to improve on previous approaches to macromolecular structure classification from cryo-ET data with limited labels. We first pretrain an encoder on unlabeled data using CSSL and then fine-tune the pretrained weights on the downstream classification task. To this end, we design a cryo-ET domain-specific data-augmentation pipeline. The benefit of augmenting cryo-ET datasets is most prominent when the original dataset is limited in size. Overall, extensive experiments on real and simulated cryo-ET data in the semi-supervised learning setting demonstrate the effectiveness of our approach in macromolecular labeling and classification.
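
As an illustration of the contrastive pretraining step described above, the following is a minimal sketch of an NT-Xent (InfoNCE-style) loss in plain Python. The abstract does not state which contrastive objective the authors use, so this choice of loss, the function names `cosine` and `nt_xent_loss`, and the `temperature` default are all assumptions made for illustration only. Each pair `view_a[i]`, `view_b[i]` stands for the encoder embeddings of two augmented views of the same subtomogram.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent_loss(view_a, view_b, temperature=0.5):
    """Hypothetical NT-Xent loss over a batch of paired embeddings.

    view_a[i] and view_b[i] are assumed to be embeddings of two
    augmentations of the same sample (the positive pair). Every other
    embedding in the batch is treated as a negative.
    """
    n = len(view_a)
    z = list(view_a) + list(view_b)  # 2n embeddings in one list
    total = 0.0
    for i in range(2 * n):
        positive = (i + n) % (2 * n)  # index of the other view of sample i
        # Similarities to every embedding except the anchor itself.
        logits = [cosine(z[i], z[k]) / temperature
                  for k in range(2 * n) if k != i]
        positive_logit = cosine(z[i], z[positive]) / temperature
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denominator = peak + math.log(
            sum(math.exp(value - peak) for value in logits))
        total += log_denominator - positive_logit
    return total / (2 * n)
```

Minimizing this quantity pulls the two views of each sample together and pushes other samples apart, which is the general idea behind contrastive pretraining; the fine-tuning stage would then train a classifier on top of the pretrained encoder using the small labeled set.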

Keywords