PLoS ONE (Jan 2016)

A Systematic Analysis of the Structures of Heterologously Expressed Proteins and Those from Their Native Hosts in the RCSB PDB Archive.

  • Ren-Bin Zhou,
  • Hui-Meng Lu,
  • Jie Liu,
  • Jian-Yu Shi,
  • Jing Zhu,
  • Qin-Qin Lu,
  • Da-Chuan Yin

DOI
https://doi.org/10.1371/journal.pone.0161254
Journal volume & issue
Vol. 11, no. 8
p. e0161254

Abstract

Read online

Recombinant expression of proteins has become an indispensable tool in modern day research. The large yields of recombinantly expressed proteins accelerate the structural and functional characterization of proteins. Nevertheless, there are literature reported that the recombinant proteins show some differences in structure and function as compared with the native ones. Now there have been more than 100,000 structures (from both recombinant and native sources) publicly available in the Protein Data Bank (PDB) archive, which makes it possible to investigate if there exist any proteins in the RCSB PDB archive that have identical sequence but have some difference in structures. In this paper, we present the results of a systematic comparative study of the 3D structures of identical naturally purified versus recombinantly expressed proteins. The structural data and sequence information of the proteins were mined from the RCSB PDB archive. The combinatorial extension (CE), FATCAT-flexible and TM-Align methods were employed to align the protein structures. The root-mean-square distance (RMSD), TM-score, P-value, Z-score, secondary structural elements and hydrogen bonds were used to assess the structure similarity. A thorough analysis of the PDB archive generated five-hundred-seventeen pairs of native and recombinant proteins that have identical sequence. There were no pairs of proteins that had the same sequence and significantly different structural fold, which support the hypothesis that expression in a heterologous host usually could fold correctly into their native forms.