IEEE Access (Jan 2020)
Improving Maintenance-Consistency Prediction During Code Clone Creation
Abstract
Developers frequently introduce code clones into software through the copy-and-paste operations during the software development phase in order to shorten development time. Not all such clone creations are beneficial to software maintenance, as they may introduce extra effort at the software maintenance phase, where additional care is needed to ensure consistent change among these clones; i.e., changes made to a piece of code may need to be propagated to other clones. Failure in doing so may risk introducing bugs into software, which are usually called consistent defect. In response to the rampant maintenance cost caused by the introduction of new clones, some researchers have advocated the use of machine-learning approach to predict the likelihood of consistent change requirement when clones are freshly introduced. Leading in this approach is the work by Wang et al., which uses Bayesian Network to model maintenance-consistency of newly introduced clones. In this work, we leverage the success of the above-mentioned work by providing a revised set of attributes that has been shown to strengthen the predictive power of the Bayesian network model, as determined more quantitatively by the precision and recall levels. We firstly provide the definition of clone consistency-maintenance requirement, which can help transfer this problem to a classification problem. Then, based on collecting all clone creation operations through traversing clone genealogies, we redesign the attribute sets for representing clone creation with more information in code and context perspective. We evaluate the effectiveness of our approach on four open source projects with more quantitative analysis, and the experimental results show that our approach possesses a powerful ability in predicting clone consistency. To transfer this work into practice, we develop an Eclipse plug-in tool of this prediction to aid developers in software development and maintenance.
Keywords