Protein Structure Prediction: Challenges, Advances, and the Shift of Research Paradigms

Bin Huang; Lupeng Kong; Chao Wang; Fusong Ju; Qi Zhang; Jianwei Zhu; Tiansu Gong; Haicang Zhang; Chungong Yu; Wei-Mou Zheng; Dongbo Bu

doi:10.1016/j.gpb.2022.11.014

Genomics, Proteomics & Bioinformatics (Oct 2023)

Affiliations

Bin Huang: Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China
Lupeng Kong: Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; Changping Laboratory, Beijing 102206, China
Chao Wang: Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
Fusong Ju: Microsoft Research AI4Science, Beijing 100080, China
Qi Zhang: Huawei Noah’s Ark Lab, Wuhan 430206, China
Jianwei Zhu: Microsoft Research AI4Science, Beijing 100080, China
Tiansu Gong: Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China
Haicang Zhang (corresponding author): Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China; Zhongke Big Data Academy, Zhengzhou 450046, China
Chungong Yu (corresponding author): Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China; Zhongke Big Data Academy, Zhengzhou 450046, China
Wei-Mou Zheng (corresponding author): Institute of Theoretical Physics, Chinese Academy of Sciences, Beijing 100190, China
Dongbo Bu (corresponding author): Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China; Zhongke Big Data Academy, Zhengzhou 450046, China

Journal volume & issue: Vol. 21, no. 5, pp. 913–925

Abstract

Protein structure prediction is an interdisciplinary research topic that has attracted researchers from multiple fields, including biochemistry, medicine, physics, mathematics, and computer science. These researchers adopt various research paradigms to attack the same structure prediction problem: biochemists and physicists attempt to reveal the principles governing protein folding; mathematicians, especially statisticians, usually start by assuming a probability distribution of protein structures given a target sequence and then find the most likely structure; and computer scientists formulate protein structure prediction as an optimization problem, either finding the structural conformation with the lowest energy or minimizing the difference between the predicted structure and the native structure. These research paradigms fall into the two statistical modeling cultures proposed by Leo Breiman, namely, data modeling and algorithmic modeling. Recently, we have also witnessed the great success of deep learning in protein structure prediction. In this review, we present a survey of the efforts in protein structure prediction. We compare the research paradigms adopted by researchers from different fields, with an emphasis on the shift of research paradigms in the era of deep learning. In short, algorithmic modeling techniques, especially deep neural networks, have considerably improved the accuracy of protein structure prediction; however, theories that interpret the neural networks and knowledge of protein folding are still highly desired.

Published in Genomics, Proteomics & Bioinformatics

ISSN: 1672-0229 (Print); 2210-3244 (Online)
Publisher: Oxford University Press
Country of publisher: United Kingdom
LCC subjects: Science: Biology (General); Medicine: Medicine (General): Computer applications to medicine. Medical informatics
Website: https://academic.oup.com/gpb
