An Evaluation of Multi-Label Classification Approaches for Method-Level Code Smells Detection

Pravin Singh Yadav; Rajwant Singh Rao; Alok Mishra

doi:10.1109/ACCESS.2024.3387856

IEEE Access (Jan 2024)

Affiliations

Pravin Singh Yadav: Department of Computer Science and Information Technology, Guru Ghasidas Vishwavidyalaya Bilaspur, Bilaspur, India
Rajwant Singh Rao: Department of Computer Science and Information Technology, Guru Ghasidas Vishwavidyalaya Bilaspur, Bilaspur, India
Alok Mishra: Faculty of Engineering, Norwegian University of Science and Technology (NTNU), Trondheim, Norway

DOI: https://doi.org/10.1109/ACCESS.2024.3387856
Journal volume & issue: Vol. 12, pp. 53664–53676

Abstract

(1) Background: Detecting code smells is among the most popular and reliable ways of identifying potential errors in code. In practice, a single piece of source code may exhibit multiple code smells. Multi-label code smell detection is attracting research interest, but few studies are available, and there is a need for a standardized classifier that reliably identifies multiple code smells in the method-level category. The primary goal of this study is to develop a rule-based method for detecting multi-label code smells. (2) Methods: The Binary Relevance, Label Powerset, and Classifier Chain methods are used with tree-based single-label algorithms, including some ensemble algorithms. The chi-square feature selection technique is applied to select relevant features. The proposed model is trained using 10-fold cross-validation with Random Search cross-validation for parameter tuning, and several performance measures are used to evaluate it. (3) Results: The proposed model achieves a best Jaccard accuracy of 99.54% for detecting method-level code smells, using the Classifier Chain method with the Decision Tree. The Decision Tree model incorporating a multi-label classifier outperforms alternative approaches to multi-label classification. Single-label classifiers produced better results once the correlation factor was considered. (4) Conclusion: This study will help scientists and programmers by providing a systematic method for detecting various code smells in software projects, saving time and effort during code reviews by detecting multiple problems simultaneously. After detecting multi-label code smells, programmers can create more organized, easier-to-understand, and trustworthy programs.
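
The three problem-transformation methods and the Jaccard measure named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy label rows, and the convention that two empty label sets score 1.0 are assumptions made for illustration, and no classifier is trained here. The sketch only shows how each method reshapes the multi-label targets before a single-label learner such as a Decision Tree would be applied.

```python
from typing import List, Sequence

# Each row of Y holds one 0/1 flag per code smell for a single method
# (hypothetical toy data; the smells themselves are left unnamed).


def binary_relevance_targets(Y: Sequence[Sequence[int]]) -> List[List[int]]:
    """Binary Relevance: one independent binary target column per label."""
    return [list(column) for column in zip(*Y)]


def label_powerset_targets(Y: Sequence[Sequence[int]]) -> List[int]:
    """Label Powerset: each distinct label combination becomes one class id."""
    class_ids = {}
    return [class_ids.setdefault(tuple(row), len(class_ids)) for row in Y]


def classifier_chain_inputs(X: Sequence[Sequence[float]],
                            Y: Sequence[Sequence[int]],
                            position: int) -> List[List[float]]:
    """Classifier Chain: the link at `position` sees the original features
    plus the labels that come earlier in the chain."""
    return [list(x) + list(y[:position]) for x, y in zip(X, Y)]


def jaccard_accuracy(Y_true: Sequence[Sequence[int]],
                     Y_pred: Sequence[Sequence[int]]) -> float:
    """Mean per-sample |true AND pred| / |true OR pred|.

    Assumption: a sample with no true and no predicted labels scores 1.0.
    """
    scores = []
    for true_row, pred_row in zip(Y_true, Y_pred):
        intersection = sum(1 for t, p in zip(true_row, pred_row) if t and p)
        union = sum(1 for t, p in zip(true_row, pred_row) if t or p)
        scores.append(1.0 if union == 0 else intersection / union)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    Y = [(1, 0), (1, 1), (1, 0), (0, 0)]
    print(binary_relevance_targets(Y))
    print(label_powerset_targets(Y))
    print(classifier_chain_inputs([[5.0], [6.0]], [(1, 0), (0, 1)], 1))
    print(jaccard_accuracy([(1, 1), (1, 0)], [(1, 0), (1, 0)]))
```

Because each chain link receives the earlier labels as extra inputs, the Classifier Chain can exploit dependencies between smells that Binary Relevance ignores, while Label Powerset captures them by treating each combination as its own class.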

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
