Scientific Data (Mar 2025)

A dataset for cyber threat intelligence modeling of connected autonomous vehicles

  • Yinghui Wang,
  • Yilong Ren,
  • Hongmao Qin,
  • Zhiyong Cui,
  • Yanan Zhao,
  • Haiyang Yu

DOI
https://doi.org/10.1038/s41597-025-04439-5
Journal volume & issue
Vol. 12, no. 1
pp. 1 – 13

Abstract

Cyber attacks pose significant threats to connected autonomous vehicles in intelligent transportation systems. Cyber threat intelligence (CTI), which involves collecting and analyzing cyber threat information, offers a promising approach to addressing emerging vehicle cyber threats and enabling proactive security defenses. Obtaining valuable information from enormous cybersecurity data using knowledge extraction technologies to achieve CTI modeling is an effective means to ensure automotive cybersecurity. However, the lack of a specialized cybersecurity dataset for automotive CTI knowledge mining has hindered progress in this field. To address this gap, we present a novel corpus specifically designed for vehicle cybersecurity knowledge mining. This dataset, annotated using a joint labeling strategy, comprises 908 real automotive cybersecurity reports, 8195 security entities and 4852 semantic relations. In addition, we conduct a comprehensive analysis of CTI knowledge mining algorithms based on this corpus. Our work provides a valuable resource for enhancing CTI modeling and advancing automotive cybersecurity research.