EURASIP Journal on Wireless Communications and Networking (Dec 2017)
Data classification algorithm for data-intensive computing environments
Abstract
Abstract Data-intensive computing has received substantial attention since the arrival of the big data era. Research on data mining in data-intensive computing environments is still in the initial stage. In this paper, a decision tree classification algorithm called MR-DIDC is proposed that is based on the programming framework of MapReduce and the SPRINT algorithm. MR-DIDC inherits the advantages of MapReduce, which make the algorithm more suitable for data-intensive computing applications. The performance of the algorithm is evaluated based on an example. The results of experiments showed that MR-DIDC can shorten the operation time and improve the accuracy in a big data environment.
Keywords