iDNA-MS: An Integrated Computational Tool for Detecting DNA Modification Sites in Multiple Genomes
Hao Lv,
Fu-Ying Dao,
Dan Zhang,
Zheng-Xing Guan,
Hui Yang,
Wei Su,
Meng-Lu Liu,
Hui Ding,
Wei Chen,
Hao Lin
Affiliations
Hao Lv
Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
Fu-Ying Dao
Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
Dan Zhang
Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
Zheng-Xing Guan
Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
Hui Yang
Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
Wei Su
Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
Meng-Lu Liu
Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
Hui Ding
Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
Wei Chen
Center for Genomics and Computational Biology, School of Life Sciences, North China University of Science and Technology, Tangshan 063000, China; Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, China
Hao Lin
Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China; Corresponding author
Summary: 5hmC, 6mA, and 4mC are three common DNA modifications involved in a variety of biological processes. Accurate genome-wide identification of these sites is invaluable for better understanding their biological functions. Because experimental methods are labor-intensive and expensive, computational methods for the genome-wide detection of these sites are urgently needed. This study therefore constructed a computational method to identify 5hmC, 6mA, and 4mC sites. We first encoded the samples using K-tuple nucleotide component, nucleotide chemical property and nucleotide frequency, and mono-nucleotide binary encoding. Subsequently, random forest was used to identify 5hmC, 6mA, and 4mC sites. Cross-validation results showed that the proposed method generalizes well in identifying the three types of modification sites. Based on the proposed model, a web server called iDNA-MS was established and is freely accessible at http://lin-group.cn/server/iDNA-MS.
Subject Areas: Genetics, Quantitative Genetics, Bioinformatics
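The three sequence encodings named in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions rather than statements from the text above: the tuple sizes (`ks=(1, 2, 3)`), the chemical-property triplets in `NCP` (the convention commonly used in the literature: ring structure, functional group, hydrogen bonding), the definition of nucleotide frequency as the cumulative density of a base up to its position, and all function names. Sequences are assumed to contain only A, C, G, and T.

```python
from itertools import product

ALPHABET = "ACGT"

# Assumed chemical-property triplets: (ring structure, functional group, hydrogen bond).
NCP = {"A": (1, 1, 1), "C": (0, 1, 0), "G": (1, 0, 0), "T": (0, 0, 1)}


def kmer_frequencies(seq, k):
    """K-tuple nucleotide component: frequency of every possible k-mer in seq."""
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = len(seq) - k + 1  # number of overlapping windows of width k
    for i in range(total):
        counts[seq[i:i + k]] += 1
    return [counts[m] / total for m in kmers]


def ncp_nd(seq):
    """Per position: three chemical-property bits plus the cumulative
    density of that base among positions 1..i (assumed definition)."""
    features = []
    seen = dict.fromkeys(ALPHABET, 0)
    for i, base in enumerate(seq, start=1):
        seen[base] += 1
        features.extend(NCP[base])
        features.append(seen[base] / i)
    return features


def binary_encoding(seq):
    """Mono-nucleotide binary (one-hot) encoding, four bits per position."""
    features = []
    for base in seq:
        features.extend(1 if base == a else 0 for a in ALPHABET)
    return features


def encode(seq, ks=(1, 2, 3)):
    """Concatenate the three encodings into one feature vector."""
    seq = seq.upper()
    vec = []
    for k in ks:
        vec.extend(kmer_frequencies(seq, k))
    vec.extend(ncp_nd(seq))
    vec.extend(binary_encoding(seq))
    return vec
```

The resulting vectors could then be passed to any random forest implementation (for example, scikit-learn's `RandomForestClassifier`) and evaluated by cross-validation; that step is omitted here to keep the sketch free of external dependencies.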