Graph neural network based approach to automatically assigning common weakness enumeration identifiers for vulnerabilities

Peng Liu; Wenzhe Ye; Haiying Duan; Xianxian Li; Shuyi Zhang; Chuanjian Yao; Yongnan Li

doi:10.1186/s42400-023-00160-1

Cybersecurity (Nov 2023)

Affiliations

Peng Liu: Key Lab of Education Blockchain and Intelligent Technology, Ministry of Education, Guangxi Normal University
Wenzhe Ye: Key Lab of Education Blockchain and Intelligent Technology, Ministry of Education, Guangxi Normal University
Haiying Duan: School of Software, Beihang University
Xianxian Li: Key Lab of Education Blockchain and Intelligent Technology, Ministry of Education, Guangxi Normal University
Shuyi Zhang: Key Lab of Education Blockchain and Intelligent Technology, Ministry of Education, Guangxi Normal University
Chuanjian Yao: Key Lab of Education Blockchain and Intelligent Technology, Ministry of Education, Guangxi Normal University
Yongnan Li: School of National Security, People’s Public Security University of China

DOI: https://doi.org/10.1186/s42400-023-00160-1
Journal volume & issue: Vol. 6, no. 1, pp. 1–15

Abstract

Vulnerability reports are essential for improving software security because they record key information about vulnerabilities. In a report, the Common Weakness Enumeration (CWE) identifier denotes the type of weakness underlying the vulnerability and thus helps readers quickly understand its cause. CWE assignment is therefore useful for categorizing newly discovered vulnerabilities. In this paper, we propose an automatic CWE assignment method based on graph neural networks. First, we prepare a dataset of 3394 real-world vulnerabilities from Linux, OpenSSL, Wireshark, and many other software programs. Then, we extract statements with vulnerability syntax features from these vulnerabilities and apply program slicing according to the categories of syntax features. We represent the resulting slices as graphs that characterize the data dependencies and control dependencies between statements. Finally, we employ graph neural networks to learn the hidden information in these graphs and use a Siamese network to compute the similarity between vulnerability functions, thereby assigning CWE IDs to the vulnerabilities. The experimental results show that the proposed method is effective compared to existing methods.
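
The final step of the abstract, assigning a CWE ID by similarity between vulnerability functions, can be illustrated with a minimal sketch. This is not the authors' implementation: the three-dimensional vectors are hypothetical stand-ins for embeddings that a trained graph neural network encoder would produce, cosine similarity is assumed here in place of whatever similarity measure the Siamese network learns, and the function names `cosine_similarity` and `assign_cwe` are invented for illustration.

```python
import math

# Hypothetical toy embeddings standing in for the output of a trained
# GNN encoder. The vectors are illustrative, not data from the paper.
REFERENCES = [
    ("CWE-119", [0.9, 0.1, 0.0]),
    ("CWE-476", [0.1, 0.9, 0.1]),
    ("CWE-416", [0.0, 0.2, 0.9]),
]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def assign_cwe(query, references):
    """Return (label, score) of the labeled embedding most similar to query."""
    best_label, best_score = None, -1.0
    for label, embedding in references:
        score = cosine_similarity(query, embedding)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Assumed embedding of a newly reported, unlabeled vulnerability function.
query = [0.8, 0.2, 0.1]
label, score = assign_cwe(query, REFERENCES)
print(label, round(score, 3))
```

In this sketch the unlabeled function inherits the CWE ID of its nearest labeled neighbor. A practical system would presumably compare against many labeled functions per CWE category rather than one.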

Published in Cybersecurity

ISSN: 2523-3246 (Online)
Publisher: SpringerOpen
Country of publisher: Singapore
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics: Computer engineering. Computer hardware; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://cybersecurity.springeropen.com/
