Data in Brief (Aug 2024)

Dataset for discovering new hypertension small molecules using machine learning-aided computational fragment-based design

  • Odifentse Mapula-e Lehasa
  • Uche A.K. Chude-Okonkwo

Journal volume & issue
Vol. 55
p. 110677

Abstract

This dataset demonstrates the use of computational fragment-based, machine learning-aided drug discovery to generate new lead molecules for the treatment of hypertension. Specifically, the focus is on agents targeting the renin-angiotensin-aldosterone system (RAAS), commonly classified as Angiotensin-Converting Enzyme Inhibitors (ACEIs) and Angiotensin II Receptor Blockers (ARBs). The preliminary dataset was a target-specific, user-generated fragment library of 63 molecular fragments derived from the 26 approved ACEI and ARB molecules obtained from the ChEMBL and DrugBank molecular databases. This fragment library served as the primary input for generating the new lead molecules presented in the dataset. The newly generated molecules were screened to check whether they met the criteria for oral drugs and contained an ACEI or ARB core functional group. Using unsupervised machine learning, the molecules that met these criteria were divided into clusters of drug classes based on their functional group allocation. This process led to three final output datasets: one containing the new ACEI molecules, another containing the new ARB molecules, and a third containing the new molecules not assigned to either class. These data can aid in the timely and efficient design of novel antihypertensive drugs. They can also be used in precision hypertension medicine for patients with treatment resistance, non-response or co-morbidities. Although this dataset is specific to antihypertensive agents, the model can be reused with minimal changes to produce new lead molecules for other health conditions.
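
The screening and class-assignment steps described above can be illustrated with a minimal Python sketch. It is not the authors' code, and it makes several assumptions: the abstract does not state which oral-drug criteria were applied, so Lipinski's rule of five is used here as a common stand-in; the `Candidate` record, its precomputed descriptor fields and all example values are hypothetical; and a simple rule-based split replaces the unsupervised clustering step, with molecules that carry both core groups placed in the unassigned set purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for a generated molecule with precomputed descriptors."""
    name: str
    mol_weight: float      # molecular weight in Da
    logp: float            # octanol-water partition coefficient
    h_donors: int          # hydrogen-bond donors
    h_acceptors: int       # hydrogen-bond acceptors
    has_acei_core: bool    # assumed flag: ACEI core functional group present
    has_arb_core: bool     # assumed flag: ARB core functional group present


def lipinski_violations(c: Candidate) -> int:
    """Count violations of Lipinski's rule of five (assumed oral-drug filter)."""
    return sum([
        c.mol_weight > 500,
        c.logp > 5,
        c.h_donors > 5,
        c.h_acceptors > 10,
    ])


def split_by_class(candidates, max_violations: int = 0):
    """Screen candidates, then assign survivors to ACEI, ARB or unassigned.

    This rule-based split is a stand-in for the unsupervised clustering
    described in the abstract, not a reimplementation of it.
    """
    out = {"ACEI": [], "ARB": [], "unassigned": []}
    for c in candidates:
        if lipinski_violations(c) > max_violations:
            continue  # fails the assumed oral-drug criteria
        if not (c.has_acei_core or c.has_arb_core):
            continue  # lacks both core functional groups
        if c.has_acei_core and not c.has_arb_core:
            out["ACEI"].append(c.name)
        elif c.has_arb_core and not c.has_acei_core:
            out["ARB"].append(c.name)
        else:
            out["unassigned"].append(c.name)  # both cores: ambiguous class
    return out


# Hypothetical example values, for illustration only.
CANDIDATES = [
    Candidate("mol_a", 376.4, 1.2, 2, 6, True, False),
    Candidate("mol_b", 435.5, 4.5, 2, 7, False, True),
    Candidate("mol_c", 620.0, 6.1, 3, 9, False, True),
    Candidate("mol_d", 410.0, 3.0, 1, 5, True, True),
    Candidate("mol_e", 300.0, 2.0, 1, 4, False, False),
]

result = split_by_class(CANDIDATES)
print(result)
```

In practice the descriptor fields would be computed from molecular structures with a cheminformatics toolkit, and the class split would come from a clustering model rather than fixed rules.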

Keywords