Process control of mAb production using multi-actor proximal policy optimization

Nikita Gupta; Shikhar Anand; Tanuja Joshi; Deepak Kumar; Manojkumar Ramteke; Hariprasad Kodamana

Digital Chemical Engineering (Sep 2023)

Affiliations

Nikita Gupta: Department of Chemical Engineering, IIT Delhi, India
Shikhar Anand: Department of Chemical Engineering, IIT Delhi, India
Tanuja Joshi: Department of Chemical Engineering, IIT Delhi, India
Deepak Kumar: Department of Chemical Engineering, IIT Delhi, India
Manojkumar Ramteke: Department of Chemical Engineering, IIT Delhi, India; Yardi School of Artificial Intelligence, IIT Delhi, India
Hariprasad Kodamana: Department of Chemical Engineering, IIT Delhi, India; Yardi School of Artificial Intelligence, IIT Delhi, India; Corresponding author at: Department of Chemical Engineering, IIT Delhi, India.

Journal volume & issue: Vol. 8, p. 100108

Abstract

Monoclonal antibodies (mAb) are biopharmaceutical products that improve human immunity. In this work, we propose a multi-actor proximal policy optimization (PPO)-based reinforcement learning (RL) approach for the control of mAb production. Here, the manipulated variable is the flowrate and the controlled variable is the mAb concentration. Based on root mean square error (RMSE) values and convergence performance, multi-actor PPO performs better than the other RL algorithms considered. PPO predicts a 40% reduction in the number of days needed to reach the desired concentration. Moreover, the performance of PPO improves as the number of actors increases: the PPO agent performs best with three actors, and its performance deteriorates when the number of actors is increased further. These results are verified in three case studies: (i) nominal conditions, (ii) noise in raw materials and measurements, and (iii) stochastic disturbance in temperature together with noise in measurements. The results indicate that the proposed approach outperforms the deep deterministic policy gradient (DDPG), twin delayed deep deterministic policy gradient (TD3), and single-actor PPO algorithms for the control of the bioreactor system.
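
The two quantities the abstract relies on, the RMSE of the controlled variable against its target and the PPO objective, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the constant setpoint, and the clipping parameter `clip_eps=0.2` are assumptions made for illustration, and the multi-actor scheme itself is not reproduced because the abstract does not describe it. The second function is the standard per-sample clipped surrogate term of PPO.

```python
import math


def rmse(measured, setpoint):
    """RMSE between a measured trajectory and a constant setpoint.

    `measured` is a hypothetical sequence of mAb concentration samples;
    `setpoint` is the desired concentration (assumed constant here).
    """
    if not measured:
        raise ValueError("measured trajectory must be non-empty")
    squared_errors = [(y - setpoint) ** 2 for y in measured]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def ppo_clipped_term(ratio, advantage, clip_eps=0.2):
    """Standard PPO clipped surrogate for one sample.

    ratio     : pi_new(a|s) / pi_old(a|s)
    advantage : advantage estimate for the sampled action
    clip_eps  : clipping range (0.2 is a common default, assumed here)
    Returns min(ratio * A, clip(ratio, 1 - eps, 1 + eps) * A).
    """
    clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)
```

A lower `rmse` value means the controller keeps the concentration closer to its target, which is the sense in which the abstract compares algorithms. The clipping in `ppo_clipped_term` limits how far a single update can move the policy away from the one that collected the data.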

Published in Digital Chemical Engineering

ISSN: 2772-5081 (Online)
Publisher: Elsevier
Country of publisher: United Kingdom
LCC subjects: Technology: Chemical technology: Chemical engineering; Technology: Technology (General): Industrial engineering. Management engineering: Information technology
Website: https://www.journals.elsevier.com/digital-chemical-engineering
