IEEE Access (Jan 2024)
Is This Code the Best? Or Can It Be Further Improved? Developer Stats to the Rescue
Abstract
Is the given code the best? Or can it be further improved? And if so, by how much? To answer these three questions, code cannot be viewed in isolation from its developer, as the developer factor plays a vital role in determining code quality. However, no universally accepted metric or developer stat currently exists that provides an objective indicator of a developer's ability to produce code benchmarked against an expert developer. While traditional developer stats such as rank, position, rating, and experience published on Online Judges (OJs) provide various insights into a developer's behavior and ability, they do not help answer these three questions. Moreover, answering them is not possible unless code quality can be numerically quantified. Towards this end, we conducted an empirical study of over 72 million submissions made by 143,853 users in 1,876 contests on Codeforces, a popular OJ, analyzing their code in terms of its correctness, completeness, and performance efficiency (code quality characteristics listed in the ISO/IEC 25010 product quality model) measured against the given requirements, regardless of the programming language used. First, we investigated ways to predict code quality from a developer's traditional stats using various ML regression models. To quantify and compare code quality, we introduced new concepts such as the score and the contest scorecard. Second, we identified causes of poor predictability. Our analysis helped classify users' performance in contests based on our discovery of erratic or temperamental behavior of users during contests. Third, we formulated a quality index, or $q\text{-}index$, of a developer: a new and unique developer stat that indicates a developer's ability to produce quality code and helps increase the predictability of the ML models by mitigating the negative effect of temperamental user behavior during contests.
Among the ML models evaluated, our results suggest that the GradientBoost regressor is the best suited for predicting code quality, achieving a high prediction accuracy of around 99.55%. We also demonstrated the uniqueness of the $q\text{-}index$ over traditional stats and described how it can complement the usefulness of traditional developer stats in decision making.
Keywords