Proceedings of the XXth Conference of Open Innovations Association FRUCT (Nov 2017)
Prediction of Common Weakness Probability in C/C++ Source Code Using Recurrent Neural Networks
Abstract
This article considers source code written in the C and C++ programming languages. The problem addressed is the automatic detection of potential vulnerabilities listed in the Common Weakness Enumeration (CWE). The underlying assumption is that the presence of a vulnerability is determined by its local context. Machine learning approaches based on recurrent neural networks are investigated. The training sample is built from known common weakness fixes in public source code repositories. A new static analysis approach based on recurrent neural networks is proposed. It is tested on source code blocks of different sizes and demonstrates good quality in terms of accuracy, F1 score, precision, and recall. The proposed method can be used as part of a source code quality analysis system, and can be extended for deeper source code analysis or for integration with source code autofixing tools.
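For readers less familiar with the four quality measures named above, the following minimal sketch shows how they are conventionally computed for a binary classifier. It assumes that each source code block carries a label of 1 (contains a weakness) or 0 (does not). The function name `binary_metrics` and the sample labels are hypothetical and serve only as an illustration; they are not taken from the article's experiments.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred):
    """Compute standard binary classification metrics.

    Labels are assumed to be 1 (block contains a weakness) or 0 (it does not).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one labelled block is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # A metric with an empty denominator is reported as 0.0 (a common convention).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return Metrics(accuracy, precision, recall, f1)


# Hypothetical labels for eight code blocks (illustrative only).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
result = binary_metrics(y_true, y_pred)
```

Precision and recall are reported alongside accuracy because weakness detection data is typically imbalanced: a classifier that labels every block as clean can reach high accuracy while finding no weaknesses at all.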