Divisive hierarchical maximum likelihood clustering

Alok Sharma; Yosvany López; Tatsuhiko Tsunoda

doi:10.1186/s12859-017-1965-5

BMC Bioinformatics (Dec 2017)

Affiliations

Alok Sharma: Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences
Yosvany López: Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences
Tatsuhiko Tsunoda: Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences

DOI: https://doi.org/10.1186/s12859-017-1965-5
Journal volume & issue: Vol. 18, no. S16
pp. 139 – 147

Abstract

Background: Biological data comprises various topologies or a mixture of forms, which makes its analysis extremely complicated. With this data increasing on a daily basis, the design and development of efficient and accurate statistical methods has become absolutely necessary. Specific analyses, such as those related to genome-wide association studies and multi-omics information, are often aimed at clustering sub-conditions of cancers and other diseases. Hierarchical clustering methods, which can be categorized into agglomerative and divisive, have been widely used in such situations. However, unlike agglomerative methods, divisive clustering approaches have consistently proved to be computationally expensive.

Results: The proposed clustering algorithm (DRAGON) was verified on mutation and microarray data, and was gauged against standard clustering methods in the literature. Its validation included synthetic and significant biological data. When validated on mixed-lineage leukemia data, DRAGON achieved the highest clustering accuracy with data of four different dimensions. Consequently, DRAGON outperformed previous methods with 3-, 4- and 5-dimensional acute leukemia data. When tested on mutation data, DRAGON achieved the best performance with 2-dimensional information.

Conclusions: This work proposes a computationally efficient divisive hierarchical clustering method, which can compete equally with agglomerative approaches. The proposed method correctly clustered data with distinct topologies. A MATLAB implementation can be obtained from http://www.riken.jp/en/research/labs/ims/med_sci_math/ or http://www.alok-ai-lab.com

Published in BMC Bioinformatics

ISSN: 1471-2105 (Online)
Publisher: BMC
Country of publisher: United Kingdom
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics; Science: Biology (General)
Website: http://www.biomedcentral.com/bmcbioinformatics/
