PLoS ONE (Jan 2022)
A novel computational strategy for defining the minimal protein molecular surface representation
Abstract
Most proteins perform their biological function by interacting with one or more molecular partners. In this respect, characterizing local features of the molecular surface, that can potentially be involved in the interaction with other molecules, represents a step forward in the investigation of the mechanisms of recognition and binding between molecules. Predictive methods often rely on extensive samplings of molecular patches with the aim to identify hot spots on the surface. In this framework, analysis of large proteins and/or many molecular dynamics frames is often unfeasible due to the high computational cost. Thus, finding optimal ways to reduce the number of points to be sampled maintaining the biological information (including the surface shape) carried by the molecular surface is pivotal. In this perspective, we here present a new theoretical and computational algorithm with the aim of defining a set of molecular surfaces composed of points not uniformly distributed in space, in such a way as to maximize the information of the overall shape of the molecule by minimizing the number of total points. We test our procedure’s ability in recognizing hot-spots by describing the local shape properties of portions of molecular surfaces through a recently developed method based on the formalism of 2D Zernike polynomials. The results of this work show the ability of the proposed algorithm to preserve the key information of the molecular surface using a reduced number of points compared to the complete surface, where all points of the surface are used for the description. In fact, the methodology shows a significant gain of the information stored in the sampling procedure compared to uniform random sampling.