Baghdad Science Journal (Aug 2023)
Deep Learning-based Predictive Model of mRNA Vaccine Deterioration: An Analysis of the Stanford COVID-19 mRNA Vaccine Dataset
Abstract
The emergence of SARS-CoV-2, the virus responsible for the COVID-19 pandemic, has resulted in a global health crisis, leading to widespread illness, death, and disruption of daily life. A vaccine for COVID-19 is crucial to controlling the spread of the virus, which will help to end the pandemic and restore normalcy to society. Messenger RNA (mRNA) vaccines have led the way as the fastest vaccine candidates for COVID-19, but they face key potential limitations, including spontaneous deterioration. To address mRNA degradation issues, Stanford University academics and the Eterna community sponsored a Kaggle competition. This study aims to build a deep learning (DL) model that predicts deterioration rates at each base of the mRNA molecule. A sequence DL model based on a bidirectional gated recurrent unit (GRU) is implemented. The model is applied to the Stanford COVID-19 mRNA vaccine dataset to predict the deterioration of mRNA sequences by predicting five values for every base in the sequence, namely reactivity and the deterioration rates at high pH, at high temperature, at high pH with magnesium, and at high temperature with magnesium. The dataset is split into a training set, a validation set, and a test set. The bidirectional GRU model achieves a mean column-wise root mean squared error (MCRMSE) of 0.32086 on the test set for the deterioration rates at each base of the mRNA sequence, outperforming the winning models by a margin of 0.02112. This study may help other researchers better understand how to forecast the properties of mRNA sequences in order to develop a stable COVID-19 vaccine.
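For readers unfamiliar with the evaluation metric named above, the following is a minimal sketch of MCRMSE: the root mean squared error is computed separately for each target column and the per-column values are then averaged. The function name `mcrmse` and the row-per-base, column-per-target layout are illustrative assumptions, not the authors' implementation.

```python
import math


def mcrmse(y_true, y_pred):
    """Mean column-wise RMSE.

    Assumed layout (illustrative): each row is one base position,
    each column is one predicted target (e.g. reactivity or a
    deterioration rate).
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    n_rows = len(y_true)
    n_cols = len(y_true[0])

    column_rmse = []
    for j in range(n_cols):
        # Squared error summed over all rows for target column j.
        squared_error = sum((t[j] - p[j]) ** 2 for t, p in zip(y_true, y_pred))
        column_rmse.append(math.sqrt(squared_error / n_rows))

    # Average the per-column RMSE values.
    return sum(column_rmse) / n_cols


# Toy values, not taken from the dataset.
truth = [[0.0, 0.0], [0.0, 0.0]]
preds = [[3.0, 0.0], [3.0, 0.0]]
score = mcrmse(truth, preds)
```

In this toy case the first column has an RMSE of 3 and the second an RMSE of 0, so the averaged score is 1.5. Lower values indicate better predictions.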
Keywords