Scientific Reports (Mar 2024)

7 Dimensions of software change patterns

  • Mario Janke,
  • Patrick Mäder

DOI
https://doi.org/10.1038/s41598-024-54894-0
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 17

Abstract

Evolving software is a highly complex and creative problem in which a number of different strategies are used to solve the tasks at hand. These strategies and recurring coding patterns can offer insights into the process. However, they can be highly project- or even task-specific. We aim to identify code change patterns in order to draw conclusions about the software development process. To this end, we propose a novel way to calculate high-level, file-overarching diffs and a novel way to parallelize pattern mining. In a study of 1000 Java projects, we mined and analyzed a total of 45,000 patterns. We present 13 patterns that show the extreme points of the 7 pattern categories we identified. We found that a large number of high-level change patterns exist and occur frequently. The majority of mined patterns were associated with a specific project and contributor, that is, with where and by whom a pattern was more likely to be used. While a large number of different code change patterns are used, only a few, mostly unsurprising ones are common under all circumstances. The majority of code change patterns are highly specific to different context factors, which we explore further.

Keywords