A Grouping of Song-Lyric Themes Using K-Means Clustering

Dionisia Bhisetya Rarasati

doi:10.31326/jisa.v3i2.658

JISA (Jurnal Informatika dan Sains) (Feb 2021)

Affiliations

Dionisia Bhisetya Rarasati: Universitas Bunda Mulia

DOI: https://doi.org/10.31326/jisa.v3i2.658
Journal volume & issue: Vol. 3, no. 2
pp. 38 – 41

Abstract

K-Means clustering is one method that can group themes automatically. In this research, the song theme is derived from the text of the song lyrics. The aim of the study is to develop a system that automatically groups song lyrics by theme and to measure the accuracy of that grouping. The process starts with text processing, known as text mining. Text mining begins with the text operation, which consists of tokenizing, stopword removal, stemming, and word weighting; the weighted data can then be processed with K-Means clustering. The clustering process initializes the centroids using Variance Initialization and then computes the distance from each data point to the centroids using Euclidean distance, repeating until a proper grouping is obtained. Accuracy is computed with a confusion matrix. To check the suitability of the resulting system, new data is added and processed by the system, which then assigns the new data to one specific theme. The research was conducted as a case study at Masdha Radio Yogyakarta, using a total of 400 data items divided into four clusters: love, friendship, religion, and fighting. Grouping song lyrics by theme works well, reaching 93.25% accuracy with a maximum unique-word frequency of 121 and a minimum of 0.

Keywords: K-Means clustering, Text Operation, Variance Initialization, Confusion Matrix.
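
The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the system built in the study. The stopword list is hypothetical, stemming is omitted because it depends on the language of the lyrics, word weighting is taken to be raw term frequency, and the variance-based seeding shown here (sort along the highest-variance dimension, cut into k slices, take the median point of each) is one known variant that may differ from the paper's Variance Initialization. All function names are invented for this example.

```python
import math
from collections import Counter

# Hypothetical stopword list; the list used in the study is not given in the abstract.
STOPWORDS = {"the", "a", "and", "of", "to", "in", "is", "my", "you", "i"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return "".join(c.lower() if c.isalpha() else " " for c in text).split()

def term_frequencies(docs):
    """Return (vocabulary, vectors) using raw term frequency as the word weight (assumed)."""
    counts = [Counter(t for t in tokenize(d) if t not in STOPWORDS) for d in docs]
    vocab = sorted(set().union(*counts))
    return vocab, [[float(c[w]) for w in vocab] for c in counts]

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def variance_init(vectors, k):
    """Assumed variance-based seeding: sort along the highest-variance dimension,
    cut into k equal slices, and use the median point of each slice as a centroid.
    Requires at least k vectors."""
    n, dims = len(vectors), len(vectors[0])

    def variance(j):
        mean = sum(v[j] for v in vectors) / n
        return sum((v[j] - mean) ** 2 for v in vectors) / n

    best = max(range(dims), key=variance)
    ordered = sorted(vectors, key=lambda v: v[best])
    size = n // k
    return [list(ordered[i * size + size // 2]) for i in range(k)]

def kmeans(vectors, k, max_iter=100):
    """Assign each vector to its nearest centroid until the labels stop changing."""
    centroids = variance_init(vectors, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: euclidean(v, centroids[c]))
                      for v in vectors]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def accuracy(true_labels, cluster_labels, k):
    """Build a cluster-by-class confusion matrix and score it by mapping
    each cluster to its majority true class."""
    classes = sorted(set(true_labels))
    matrix = [[0] * len(classes) for _ in range(k)]
    for t, c in zip(true_labels, cluster_labels):
        matrix[c][classes.index(t)] += 1
    return matrix, sum(max(row) for row in matrix) / len(true_labels)

# Toy data for illustration only; these are not lyrics from the study.
docs = ["love love heart", "love heart heart",
        "friend friend buddy", "friend buddy buddy"]
themes = ["love", "love", "friendship", "friendship"]
vocab, vectors = term_frequencies(docs)
labels, centroids = kmeans(vectors, k=2)
matrix, acc = accuracy(themes, labels, k=2)
```

A new lyric could then be assigned to a theme by vectorizing it over the same vocabulary and picking the centroid with the smallest Euclidean distance, which corresponds to the new-data step mentioned in the abstract.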

Published in JISA (Jurnal Informatika dan Sains)

ISSN: 2776-3234 (Print); 2614-8404 (Online)
Publisher: Program Studi Teknik Informatika Universitas Trilogi
Country of publisher: Indonesia
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Information technology; Science: Mathematics: Instruments and machines: Electronic computers. Computer science: Computer software
Website: https://trilogi.ac.id/journal/ks/index.php/JISA/index