网络与信息安全学报 (Chinese Journal of Network and Information Security) (Feb 2022)
Vulnerability identification technology research based on project version difference
Abstract
Open source code hosting platforms have brought momentum and opportunities to software development, but they also carry many security risks. Open source code is often of poor quality, project dependency libraries are complex, and vulnerability collection platforms are incomplete in the vulnerabilities they collect. These problems affect the security of open source projects and of complex software with open source components, and most security patches cannot be discovered and applied in time. As a result, attackers can easily find such vulnerable software. To discover vulnerabilities in the open source community comprehensively and promptly, a vulnerability identification system based on project version difference, VpatchFinder, was proposed. The update contents of projects in the open source community were collected automatically, and features were defined as security behaviors and code differences drawn from the code and logs in patches. In total, 40 features, organized into a comment information feature group, a page statistics feature group, a code statistics feature group and a vulnerability type feature group, were proposed to build the feature set. A random forest model was then built to learn classifiers for vulnerability identification. The results show that VpatchFinder achieves a precision of 0.844, an accuracy of 0.855 and a recall of 0.851. In addition, 68.07% of community vulnerabilities can be discovered early by VpatchFinder among real open source CVE vulnerabilities. These results can help address current issues in software security architecture design and development.
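To make the feature-extraction step more concrete, the following is a minimal sketch of how a patch's log message and code difference might be turned into a numeric feature vector. It is not the paper's implementation: the abstract does not enumerate the 40 features, so the function name `extract_features`, the keyword list and every feature name below are hypothetical stand-ins, loosely mirroring the comment information and code statistics groups. A vector of this kind could then be passed to a random forest classifier.

```python
import re

# Hypothetical keyword list (an assumption for illustration only);
# the paper's real feature definitions are not given in the abstract.
SECURITY_KEYWORDS = ("overflow", "vulnerab", "cve", "security", "injection")


def extract_features(commit_message: str, diff_text: str) -> dict:
    """Build a small feature dict from a commit log message and a unified diff."""
    message = commit_message.lower()
    lines = diff_text.splitlines()

    # Added/removed code lines, excluding the '+++ ' / '--- ' file header lines.
    added = [l for l in lines if l.startswith("+") and not l.startswith("+++ ")]
    removed = [l for l in lines if l.startswith("-") and not l.startswith("--- ")]
    files = [l for l in lines if l.startswith("+++ ")]

    return {
        # Comment-information style features (from the log message).
        "msg_security_keywords": sum(message.count(k) for k in SECURITY_KEYWORDS),
        "msg_mentions_cve": int(bool(re.search(r"cve-\d{4}-\d{4,}", message))),
        # Code-statistics style features (from the diff).
        "files_changed": len(files),
        "lines_added": len(added),
        "lines_removed": len(removed),
        # Rough heuristic: added 'if (... < / > ...)' lines often indicate bounds checks.
        "added_comparisons": sum(
            1 for l in added if re.search(r"\bif\s*\(.*[<>]", l)
        ),
    }


if __name__ == "__main__":
    example_message = "Fix buffer overflow in parse_header (CVE-2021-12345)"
    example_diff = "\n".join([
        "--- a/src/parse.c",
        "+++ b/src/parse.c",
        "@@ -10,3 +10,5 @@",
        "-    memcpy(buf, src, len);",
        "+    if (len > sizeof(buf))",
        "+        return -1;",
        "+    memcpy(buf, src, len);",
    ])
    print(extract_features(example_message, example_diff))
```

The heuristics here are deliberately simple; for instance, a removed source line that itself begins with `-- ` would be mistaken for a file header. A real system would parse diffs with a proper patch parser and use a much richer feature set.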
Keywords