Machine Learning with Applications (Mar 2025)
Safety analysis in the era of large language models: A case study of STPA using ChatGPT
Abstract
Can safety analysis leverage Large Language Models (LLMs)? This study examines the application of Systems Theoretic Process Analysis (STPA) to Automatic Emergency Brake (AEB) and Electricity Demand Side Management (DSM) systems, utilising Chat Generative Pre-Trained Transformer (ChatGPT). We investigate the impact of collaboration schemes, input semantic complexity, and prompt engineering on STPA results. Comparative results indicate that using ChatGPT without human intervention may be inadequate due to reliability issues. However, with careful design, it has the potential to outperform human experts. No statistically significant differences were observed when varying the input semantic complexity or using domain-agnostic prompt guidelines. STPA-specific prompt engineering produced statistically significant differences and more pertinent results, although ChatGPT generally yielded more conservative and less comprehensive outcomes. We also identify future challenges, such as concerns regarding the trustworthiness of LLMs and the need for standardisation and regulation in this field. All experimental data are publicly accessible.