PLoS ONE (Jan 2024)

Numerical stability of DeepGOPlus inference.

  • Inés Gonzalez Pepe,
  • Yohan Chatelain,
  • Gregory Kiar,
  • Tristan Glatard

DOI: https://doi.org/10.1371/journal.pone.0296725
Journal volume & issue: Vol. 19, no. 1, p. e0296725

Abstract

Convolutional neural networks (CNNs) are currently among the most widely used deep neural network (DNN) architectures and achieve state-of-the-art performance on many problems. Originally applied to computer vision tasks, CNNs work well on any data with spatial structure, not only images, and have been applied in many fields. However, recent works have highlighted numerical stability challenges in DNNs, which also relate to their known sensitivity to noise injection. These challenges can jeopardise their performance and reliability. This paper investigates DeepGOPlus, a CNN that predicts protein function. DeepGOPlus has achieved state-of-the-art performance and can take advantage of, and annotate, the abundance of protein sequences emerging in proteomics. We determine the numerical stability of the model's inference stage by quantifying the numerical uncertainty resulting from perturbations of the underlying floating-point data. In addition, we explore the opportunity to use reduced-precision floating-point formats for DeepGOPlus inference, to reduce memory consumption and latency. This is achieved by instrumenting DeepGOPlus' execution using Monte Carlo Arithmetic, a technique that experimentally quantifies floating-point operation errors, and VPREC, a tool that emulates results with customizable floating-point precision formats. Focus is placed on the inference stage because it is the primary deliverable of the DeepGOPlus model and is widely applicable across different environments. Overall, our results show that although the DeepGOPlus CNN is very stable numerically, it can only be selectively implemented with lower-precision floating-point formats. We conclude that predictions obtained from the pre-trained DeepGOPlus model are very reliable numerically, and use existing floating-point formats efficiently.
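
The two techniques named in the abstract can be illustrated with a small, self-contained sketch. It is a toy emulation written for this summary, not the instrumentation used in the paper: the function names (perturb, noisy_dot, significant_bits, round_to_precision), the dot-product workload, and the chosen precisions are all assumptions made for illustration. The first part injects uniform random noise at a virtual precision of t bits after every operation, in the spirit of Monte Carlo Arithmetic, and estimates the number of significant bits from the spread of repeated runs using the common estimator -log2(sigma / |mu|). The second part rounds a value to a p-bit significand, which mimics the effect of a reduced-precision format while ignoring its exponent range.

```python
import math
import random
import statistics


def perturb(x, t, rng):
    """Add uniform noise at virtual precision t bits (MCA-style, hypothetical helper)."""
    if x == 0.0:
        return x
    e = math.frexp(x)[1]            # x = m * 2**e with 0.5 <= |m| < 1
    xi = rng.uniform(-0.5, 0.5)     # random offset of at most half a unit at bit t
    return x + xi * 2.0 ** (e - t)


def noisy_dot(w, x, t, rng):
    """Dot product with the result of every multiply and add perturbed."""
    acc = 0.0
    for wi, xi in zip(w, x):
        acc = perturb(acc + perturb(wi * xi, t, rng), t, rng)
    return acc


def significant_bits(samples):
    """Estimate significant bits as -log2(sigma / |mu|) over repeated runs."""
    mu = statistics.fmean(samples)
    sigma = statistics.stdev(samples)
    if sigma == 0.0:
        return math.inf             # no variation observed across the samples
    if mu == 0.0:
        return 0.0                  # relative error undefined; report no reliable bits
    return -math.log2(sigma / abs(mu))


def round_to_precision(x, p):
    """Round x to a p-bit significand (round-to-nearest-even); exponent range ignored."""
    if x == 0.0 or not math.isfinite(x):
        return x
    m, e = math.frexp(x)
    return math.ldexp(round(m * 2 ** p) / 2 ** p, e)


# Illustrative workload: a 100-term dot product stands in for one layer's arithmetic.
rng = random.Random(0)
weights = [rng.uniform(0.1, 1.0) for _ in range(100)]
inputs = [rng.uniform(0.1, 1.0) for _ in range(100)]

# Repeat the perturbed computation and summarise the spread of the results.
runs = [noisy_dot(weights, inputs, 24, rng) for _ in range(30)]
print("estimated significant bits:", round(significant_bits(runs), 1))

# An 11-bit significand matches the precision of IEEE binary16.
print("0.1 with an 11-bit significand:", round_to_precision(0.1, 11))
```

A computation whose estimated significant bits stay close to the virtual precision under such perturbations is numerically stable in the sense studied here, and comparing outputs before and after round_to_precision gives a rough idea of how much precision a given step can give up.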