A Taxonomy for Python Vulnerabilities

Frederic C. G. Bogaerts; Naghmeh Ivaki; Jose Fonseca

doi:10.1109/OJCS.2024.3422686

IEEE Open Journal of the Computer Society (Jan 2024)

Affiliations

Frederic C. G. Bogaerts: Department of Informatics Engineering, University of Coimbra, CISUC, DEI, Coimbra, Portugal
Naghmeh Ivaki: Department of Informatics Engineering, University of Coimbra, CISUC, DEI, Coimbra, Portugal
Jose Fonseca: Department of Informatics Engineering, Polytechnic Institute of Guarda, University of Coimbra, CISUC, Guarda, Portugal

Journal volume & issue: Vol. 5, pp. 368–379

Abstract

Python is one of the most widely adopted programming languages, with applications from web development to data science and machine learning. Despite its popularity, Python is susceptible to vulnerabilities compromising the systems that rely on it. To effectively address these challenges, developers, researchers, and security teams need to identify, analyze, and mitigate risks in Python code, but this is not an easy task due to the scattered, incomplete, and non-actionable nature of existing vulnerability data. This article introduces a comprehensive dataset comprising 1026 publicly disclosed Python vulnerabilities sourced from various repositories. These vulnerabilities are meticulously classified using widely recognized frameworks, such as Orthogonal Defect Classification (ODC), Common Weakness Enumeration (CWE), and Open Web Application Security Project (OWASP) Top 10. Our dataset is accompanied by patched and vulnerable code samples (some crafted with the help of AI), enhancing its utility for developers, researchers, and security teams. In addition, a user-friendly website was developed to allow its interactive exploration and facilitate new contributions from the community. Access to this dataset will foster the development and testing of safer Python applications. The resulting dataset is also analyzed, looking for trends and patterns in the occurrence of Python vulnerabilities, with the aim of raising awareness of Python security and providing practical, actionable guidance to assist developers, researchers, and security teams in bolstering their practices. This includes insights into the types of vulnerabilities they should focus on, the most exploited categories, and the common errors that programmers tend to make while coding that can lead to vulnerabilities.

Published in IEEE Open Journal of the Computer Society

ISSN: 2644-1268 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science; Technology: Technology (General): Industrial engineering. Management engineering: Information technology
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=8782664
