In silico proof of principle of machine learning-based antibody design at unconstrained scale

Rahmad Akbar; Philippe A. Robert; Cédric R. Weber; Michael Widrich; Robert Frank; Milena Pavlović; Lonneke Scheffer; Maria Chernigovskaya; Igor Snapkov; Andrei Slabodkin; Brij Bhushan Mehta; Enkelejda Miho; Fridtjof Lund-Johansen; Jan Terje Andersen; Sepp Hochreiter; Ingrid Hobæk Haff; Günter Klambauer; Geir Kjetil Sandve; Victor Greiff

doi:10.1080/19420862.2022.2031482

mAbs (Dec 2022)

Affiliations

Rahmad Akbar: Department of Immunology, Oslo University Hospital Rikshospitalet and University of Oslo, Norway
Philippe A. Robert: Department of Immunology, Oslo University Hospital Rikshospitalet and University of Oslo, Norway
Cédric R. Weber: Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
Michael Widrich: ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Linz, Austria
Robert Frank: Department of Immunology, Oslo University Hospital Rikshospitalet and University of Oslo, Norway
Milena Pavlović: Department of Informatics, University of Oslo, Oslo, Norway
Lonneke Scheffer: Department of Informatics, University of Oslo, Oslo, Norway
Maria Chernigovskaya: Department of Immunology, Oslo University Hospital Rikshospitalet and University of Oslo, Norway
Igor Snapkov: Department of Immunology, Oslo University Hospital Rikshospitalet and University of Oslo, Norway
Andrei Slabodkin: Department of Immunology, Oslo University Hospital Rikshospitalet and University of Oslo, Norway
Brij Bhushan Mehta: Department of Immunology, Oslo University Hospital Rikshospitalet and University of Oslo, Norway
Enkelejda Miho: Institute of Medical Engineering and Medical Informatics, School of Life Sciences, FHNW University of Applied Sciences and Arts Northwestern Switzerland, Muttenz, Switzerland
Fridtjof Lund-Johansen: Department of Immunology, Oslo University Hospital Rikshospitalet and University of Oslo, Norway
Jan Terje Andersen: Department of Immunology, Oslo University Hospital Rikshospitalet and University of Oslo, Norway
Sepp Hochreiter: ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Linz, Austria
Ingrid Hobæk Haff: Department of Mathematics, University of Oslo, Oslo, Norway
Günter Klambauer: ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Linz, Austria
Geir Kjetil Sandve: Department of Informatics, University of Oslo, Oslo, Norway
Victor Greiff: Department of Immunology, Oslo University Hospital Rikshospitalet and University of Oslo, Norway

Journal volume & issue: Vol. 14, no. 1

Abstract

Generative machine learning (ML) has been postulated to become a major driver in the computational design of antigen-specific monoclonal antibodies (mAbs). However, efforts to confirm this hypothesis have been hindered by the infeasibility of testing arbitrarily large numbers of antibody sequences for their most critical design parameters: paratope, epitope, affinity, and developability. To address this challenge, we leveraged a lattice-based antibody-antigen binding simulation framework, which incorporates a wide range of physiological antibody-binding parameters. The simulation framework enables the computation of synthetic antibody-antigen 3D structures, and it functions as an oracle for unrestricted prospective evaluation and benchmarking of antibody design parameters of ML-generated antibody sequences. We found that a deep generative model trained exclusively on antibody sequence (one-dimensional: 1D) data can be used to design conformational (three-dimensional: 3D) epitope-specific antibodies, matching or exceeding the training dataset in affinity and developability parameter value variety. Furthermore, we established a lower threshold of sequence diversity necessary for high-accuracy generative antibody ML and demonstrated that this lower threshold also holds on experimental real-world data. Finally, we showed that transfer learning enables the generation of high-affinity antibody sequences from low-N training data. Our work establishes the a priori feasibility and the theoretical foundation of high-throughput ML-based mAb design.

Published in mAbs

ISSN: 1942-0862 (Print); 1942-0870 (Online)
Publisher: Taylor & Francis Group
Country of publisher: United Kingdom
LCC subjects: Medicine: Therapeutics. Pharmacology; Medicine: Internal medicine: Specialties of internal medicine: Immunologic diseases. Allergy
Website: https://www.tandfonline.com/journals/kmab