Data Mining Methods for Omics and Knowledge of Crude Medicinal Plants toward Big Data Biology

Farit M. Afendi; Naoaki Ono; Latifah K. Darusman; Kensuke Nakamura; Yukiko Nakamura; Nelson Kibinge; Aki Hirai Morita; Hisayuki Horai; Md. Altaf-Ul-Amin; Shigehiko Kanaya; Ken Tanaka

Computational and Structural Biotechnology Journal (Jan 2013)

Journal volume & issue: Vol. 4, no. 5
p. e201301010

Abstract

Molecular biological data have increased rapidly with recent progress in the Omics fields (e.g., genomics, transcriptomics, proteomics and metabolomics), which necessitates the development of databases and methods for efficient storage, retrieval, integration and analysis of massive data. The present study reviews the use of the KNApSAcK Family DB in metabolomics and related areas, discusses several statistical methods for handling multivariate data, and shows their application to Indonesian blended herbal medicines (Jamu) as a case study. Exploration using a biplot reveals that many plants are rarely utilized, while some plants are highly utilized toward a specific efficacy. Furthermore, the ingredients of Jamu formulas are modeled using Partial Least Squares Discriminant Analysis (PLS-DA) in order to predict their efficacy. The plants used in each Jamu medicine serve as the predictors, whereas the efficacy of each Jamu provides the responses. This model achieves 71.6% correct classification in predicting efficacy. A permutation test is then used to determine the plants that serve as main ingredients in Jamu formulas by evaluating the significance of the PLS-DA coefficients. Next, in order to explain the role of the plants that serve as main ingredients in Jamu medicines, information on the pharmacological activity of the plants is added to the predictor block. An N-PLS-DA model, the multiway version of PLS-DA, is then used to handle the resulting three-dimensional array of the predictor block. The resulting N-PLS-DA model reveals that the effects of some pharmacological activities are specific to a certain efficacy, while other activities are spread across many efficacies. The mathematical modeling introduced in the present study can be used in the global analysis of big data aimed at revealing the underlying biology.
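
The PLS-DA and permutation-test steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simplified two-class problem (the study predicts several efficacy classes), a single PLS component, and a small synthetic plant-usage matrix invented for illustration. The function names `pls1_coefficients` and `permutation_p_values` are hypothetical, and only the Python standard library is used.

```python
import random


def center(values):
    """Subtract the mean from a list of numbers."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def pls1_coefficients(X, y):
    """Regression coefficients of a one-component PLS model (PLS1).

    X: rows are Jamu formulas, columns are plants (1 = plant is used).
    y: class label per formula (1 = has the efficacy, 0 = does not).
    """
    n, p = len(X), len(X[0])
    cols = [center([X[i][j] for i in range(n)]) for j in range(p)]
    yc = center(y)
    # Weight vector w is proportional to X^T y (covariance of each plant with the class).
    w = [sum(cols[j][i] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in w) ** 0.5
    if norm == 0.0:
        return [0.0] * p
    w = [v / norm for v in w]
    # Latent scores t = X w, then regress y on t to get the loading q.
    t = [sum(cols[j][i] * w[j] for j in range(p)) for i in range(n)]
    q = sum(t[i] * yc[i] for i in range(n)) / sum(v * v for v in t)
    return [w[j] * q for j in range(p)]


def permutation_p_values(X, y, n_perm=200, seed=0):
    """Permutation p-value for each coefficient, using |coefficient| as the statistic."""
    rng = random.Random(seed)
    observed = pls1_coefficients(X, y)
    exceed = [0] * len(observed)
    y_perm = list(y)
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # break the plant-efficacy association
        null = pls1_coefficients(X, y_perm)
        for j, (b_obs, b_null) in enumerate(zip(observed, null)):
            if abs(b_null) >= abs(b_obs):
                exceed[j] += 1
    # Add-one correction keeps p-values strictly positive.
    return observed, [(c + 1) / (n_perm + 1) for c in exceed]


# Synthetic data (an assumption for illustration): 40 formulas, 4 plants.
# Plant 0 is used only in formulas with the efficacy; plants 1-3 are unrelated to it.
y = [1 if i < 20 else 0 for i in range(40)]
X = [[y[i], i % 2, (i // 2) % 2, (i // 5) % 2] for i in range(40)]

observed, p_values = permutation_p_values(X, y)
for j, (b, pv) in enumerate(zip(observed, p_values)):
    print(f"plant {j}: coefficient={b:.3f}, permutation p-value={pv:.3f}")
```

In this toy setting, a plant whose coefficient is rarely matched under shuffled labels would be flagged as a main ingredient candidate. A multi-class model with several components, as used in the study, would need a dummy-coded response matrix and a full PLS2 algorithm rather than this single-component shortcut.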

Published in Computational and Structural Biotechnology Journal

ISSN: 2001-0370 (Online)
Publisher: Elsevier
Country of publisher: Netherlands
LCC subjects: Technology: Chemical technology: Biotechnology
Website: https://www.sciencedirect.com/journal/computational-and-structural-biotechnology-journal
