Informatics in Medicine Unlocked (Jan 2021)
A fuzzy c-means clustering approach for haplotype reconstruction based on minimum error correction
Abstract
The study of human genetic evolution has attracted considerable attention. Haplotype determination provides key information about human genetics and helps clarify probable causal relations between traits and diseases. In general, experimental methods of haplotype reconstruction are costly in terms of time and resources. State-of-the-art high-throughput sequencing makes it possible to use computational methods for this task instead. However, the accuracy of current algorithms degrades as the error rate of their input fragments increases. In this article, we put forward FCMHap, an efficient and accurate method that involves two steps. In the first step, it constructs a weighted fuzzy conflict graph based on the similarities of the input fragments and divides the fragments into two clusters by iteratively partitioning the graph. Since the input fragments contain noise and gaps, in the second step it adapts the cluster centers using the fuzzy c-means (FCM) algorithm. The proposed method has been evaluated on several real datasets and compared with a selected set of current approaches. The evaluation results indicate that this method can complement those approaches.
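To make the two ingredients named in the abstract more concrete, the following is a minimal sketch of the minimum error correction (MEC) score and of one fuzzy c-means style refinement of two haplotype centers. It is an illustration under stated assumptions, not the FCMHap implementation: SNPs are assumed biallelic and coded as "0"/"1", "-" is assumed to mark a gap, distance is taken as Hamming distance over covered positions, the fuzzifier is assumed to be m = 2, and all function names are hypothetical. The weighted fuzzy conflict graph of the first step is not modeled here.

```python
from typing import List, Sequence

GAP = "-"  # assumed symbol for SNP positions a fragment does not cover


def hamming(fragment: str, haplotype: str) -> int:
    """Count mismatches over the positions the fragment actually covers."""
    return sum(1 for f, h in zip(fragment, haplotype) if f != GAP and f != h)


def mec_score(fragments: Sequence[str], h1: str, h2: str) -> int:
    """MEC: each fragment is charged its distance to the closer haplotype."""
    return sum(min(hamming(f, h1), hamming(f, h2)) for f in fragments)


def fuzzy_memberships(fragments: Sequence[str], centers: Sequence[str],
                      m: float = 2.0, eps: float = 1e-9) -> List[List[float]]:
    """Standard FCM membership rule: u_ik = 1 / sum_j (d_ik / d_ij)^(2/(m-1))."""
    power = 2.0 / (m - 1.0)
    memberships = []
    for f in fragments:
        # eps avoids division by zero when a fragment matches a center exactly
        dists = [hamming(f, c) + eps for c in centers]
        memberships.append(
            [1.0 / sum((d_k / d_j) ** power for d_j in dists) for d_k in dists]
        )
    return memberships


def update_centers(fragments: Sequence[str], memberships: Sequence[Sequence[float]],
                   centers: Sequence[str], m: float = 2.0) -> List[str]:
    """Assumed center update: per-position vote weighted by u_ik^m, gaps ignored."""
    new_centers = []
    for k, old in enumerate(centers):
        alleles = []
        for j, old_allele in enumerate(old):
            weight = {"0": 0.0, "1": 0.0}
            for f, u in zip(fragments, memberships):
                if f[j] != GAP:
                    weight[f[j]] += u[k] ** m
            if weight["0"] == weight["1"]:
                alleles.append(old_allele)  # tie or no coverage: keep old allele
            else:
                alleles.append("1" if weight["1"] > weight["0"] else "0")
        new_centers.append("".join(alleles))
    return new_centers


# Toy data (made up for illustration): four clean fragments and one noisy one.
fragments = ["0011-", "001-1", "-1100", "11-00", "0010-"]
centers = ["00111", "11100"]
u = fuzzy_memberships(fragments, centers)
centers = update_centers(fragments, u, centers)
print(centers, mec_score(fragments, *centers))
```

On this toy input the centers are left unchanged by the update, and the only contribution to the MEC score comes from the noisy fragment "0010-", which disagrees with its closer center at a single position. The soft memberships are what keep that fragment from pulling the fourth position of the first center toward "0".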