IEEE Access (Jan 2024)
Similarity-Based Source Code Vulnerability Detection Leveraging Transformer Architecture: Harnessing Cross- Attention for Hierarchical Analysis
Abstract
The growing complexity and volume of modern software have led to an increase in source code vulnerabilities, posing significant security risks. In response, deep learning-based automated source code vulnerability detection methods, particularly those utilizing source code similarity analysis, have recently emerged as promising solutions. However, existing similarity-based source code vulnerability detection methods frequently fail to fully utilize information from the hierarchical structure of source code and are often computationally expensive, limiting their practicality in real-world scenarios. In this paper, we introduce XTransformer, a novel deep learning-based source code vulnerability detector tailored for comparing target source code against archived vulnerable codes across various levels of the source code’s hierarchical structure by leveraging extra cross-attention imposed on the transformer architecture. Additionally, we propose a specialized training strategy based on supervised contrastive learning to improve XTransformer’s ability to effectively learn and differentiate between vulnerable and non-vulnerable source codes. Comprehensive experiments demonstrate that XTransformer outperforms current state-of-the-art methods across different datasets and code lengths while significantly reducing the inference time compared to other similarity-based methods that utilize hierarchical information from source code.
Keywords