JISA (Jurnal Informatika dan Sains) (Jun 2022)
Grouping of Village Status in West Java Province Using the Manhattan, Euclidean and Chebyshev Methods on the K-Mean Algorithm
Abstract
The Ministry of Villages, Development of Disadvantaged Areas and Transmigration (Ministry of Village PDTT) is a ministry within the Indonesian Government in charge of village and rural area development, empowerment of rural communities, accelerated development of disadvantaged areas, and transmigration. Village Potential Data for 2014 (Podes 2014) for West Java Province is data issued by the Central Statistics Agency in collaboration with the Ministry of Village PDTT; it is unlabeled (suited to unsupervised learning) and covers 5,319 villages. The Podes 2014 data for West Java Province were compiled according to the level of village development (village specific) in Indonesia, with the village as the unit of analysis. Based on Regulation Number 2 of 2016 of the Minister of Villages, Disadvantaged Areas and Transmigration of the Republic of Indonesia concerning the village development index, villages are classified into five statuses, namely Very Disadvantaged Village, Disadvantaged Village, Developing Village, Advanced Village and Independent Village, according to their ability to manage and increase the potential of social, economic and ecological resources. Village status is inseparable from village development, which is supported by government funding. However, village development funds have not been distributed effectively and accurately according to the conditions and potential of each village because clear information about village status is lacking. As a result, there is still too little information on which villages should be prioritized for funding and attention from the government. Data mining offers methods for grouping the objects in a data set into classes that share the same criteria (clustering). One algorithm that can be used for clustering is the k-means algorithm, which groups data by assigning each data point to the centroid at the closest distance.
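The nearest-centroid assignment at the core of k-means can be illustrated with a minimal sketch. The function name `assign_to_nearest`, the toy two-dimensional points and the two centroids below are hypothetical and are not taken from the Podes 2014 data; the sketch assumes Euclidean distance and shows only the assignment step, not the full algorithm.

```python
import math

def assign_to_nearest(points, centroids):
    """Return, for each point, the index of its closest centroid.

    Assumes Euclidean distance (math.dist); this is only the assignment
    step of k-means, without the centroid update loop.
    """
    labels = []
    for p in points:
        distances = [math.dist(p, c) for c in centroids]
        labels.append(distances.index(min(distances)))
    return labels

# Hypothetical toy data: four points and two centroids in two dimensions.
points = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)]
centroids = [(1.0, 1.5), (8.5, 8.0)]

labels = assign_to_nearest(points, centroids)
print(labels)  # [0, 0, 1, 1]
```

In a full k-means run, this step would alternate with recomputing each centroid as the mean of its assigned points until the assignments stop changing.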
In this study, three types of distance calculation in the k-means algorithm are compared: Manhattan, Euclidean and Chebyshev. Validation tests were carried out using execution time and the Davies-Bouldin index. In these tests, the Podes 2014 data for West Java Province were grouped into all five village statuses: 694 villages in the Very Disadvantaged Village cluster, 567 villages in the Disadvantaged Village cluster, 1,440 villages in the Developing Village cluster, 1,557 villages in the Advanced Village (Desa Maju) cluster and 1,061 villages in the Independent Village cluster. For distance calculation, Chebyshev has the shortest accumulated execution time, 1 second, compared with 1.6 seconds for Euclidean and 2.4 seconds for Manhattan. Meanwhile, the Euclidean method yields the best Davies-Bouldin index, 0.886, compared with 0.926 for Manhattan and 0.990 for Chebyshev.
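As an illustration of the quantities compared here, the sketch below implements the three distance measures and a basic form of the Davies-Bouldin index, for which lower values indicate more compact and better separated clusters. It is a sketch under stated assumptions, not the study's implementation: the helper names, the toy clusters and the choice to pass the distance function into the index are all hypothetical, and the abstract does not state which distance was used inside the index.

```python
import math

def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def chebyshev(a, b):
    """Largest absolute coordinate difference."""
    return max(abs(x - y) for x, y in zip(a, b))

def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def davies_bouldin(clusters, dist=euclidean):
    """Davies-Bouldin index for a list of non-empty clusters (at least two).

    Assumption: the same `dist` is used for within-cluster scatter and
    for centroid separation, and no two centroids coincide.
    """
    centers = [centroid(c) for c in clusters]
    # Scatter: mean distance from each member to its cluster centroid.
    scatter = [sum(dist(p, ctr) for p in c) / len(c)
               for c, ctr in zip(clusters, centers)]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst-case similarity between cluster i and any other cluster.
        total += max((scatter[i] + scatter[j]) / dist(centers[i], centers[j])
                     for j in range(k) if j != i)
    return total / k

# Hypothetical toy clusters, not taken from the Podes 2014 data.
clusters = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 0.0), (10.0, 2.0)]]
db_value = davies_bouldin(clusters)
```

For the points (0, 0) and (3, 4), the three measures give 7, 5 and 4 respectively, which shows how the choice of distance changes which centroid counts as closest.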
Keywords