Jisuanji kexue (Dec 2022)

Empirical Study on Defects in R Programming Language and Core Packages

  • WANG Zi-yuan, BU De-xin, LI Ling-ling, ZHANG Xia

DOI
https://doi.org/10.11896/jsjkx.220200181
Journal volume & issue
Vol. 49, no. 12
pp. 89 – 98

Abstract

The R programming language, which provides a variety of statistical computing functions, is considered one of the programming languages most suitable for artificial intelligence. The correctness of a language implementation is a prerequisite for the correctness of programs developed in that language. However, the R programming language inevitably contains many defects. This paper presents an empirical study of defects in the R programming language and its core packages. By analyzing 7020 issues, we find that: 1) Among the 35 versions involved in these defects, R 3.1.2, R 3.0.2, and R 3.5.0 have the most defects, and the defects are concentrated in a few components such as Documentation, Graphics, and Language. 2) The components with higher overall defect priority include Startup, Installation, and Analyses, and the components with higher overall defect severity include I/O, Installation, and Accuracy. There is a statistically significant, moderate correlation between defect priority and defect severity. 3) About 78% of defects were repaired within one year. 4) Semantic faults are the most frequent root cause of defects, among which the "missing feature" and "processing" categories are the most common. These findings reveal patterns in the defects of the R programming language and its core packages. They can help developers of the R programming language improve development quality, help maintainers detect and repair defects more effectively, and help users avoid potential risks.

Keywords