Applied Sciences (Apr 2020)

Better Not to Use Vulnerability’s Reference for Exploitability Prediction

  • Heedong Yang,
  • Seungsoo Park,
  • Kangbin Yim,
  • Manhee Lee

DOI
https://doi.org/10.3390/app10072555
Journal volume & issue
Vol. 10, no. 7
p. 2555

Abstract

About half of all exploits become available within about two weeks of the release date of the corresponding vulnerability. However, 80% of released vulnerabilities are never exploited. Because spending the same effort on eliminating every vulnerability is wasteful, software companies usually use various methods to assess which vulnerabilities are more serious and need an immediate patch. Recently, there have been attempts to use machine learning techniques to predict a vulnerability’s exploitability. In these approaches, a vulnerability’s related URLs, called its references, are commonly used as features for the machine learning algorithm. However, we found that some references contain proof-of-concept code. In this paper, we analyzed all references in the National Vulnerability Database and found that 46,202 of them contained such code. We compared prediction performance between feature matrices with and without reference information. Experimental results showed that test sets that used references containing proof-of-concept code had better prediction performance than those that used references without such code. Even though the difference is not large, it is clear that references carrying answer information contributed to the prediction performance, which is undesirable because it amounts to leaking the answer into the features. Thus, it is better not to use reference information to predict vulnerability exploitation.
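
The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record layout (`description`, `references`, `exploited`), the host list `POC_HOSTS`, and all function names are hypothetical and chosen only for illustration, and the abstract does not state how the paper identified references containing proof-of-concept code. The sketch shows how a feature set can be built with or without reference information, and how test accuracy can be reported separately for records whose references point at an assumed proof-of-concept host.

```python
from urllib.parse import urlparse

# Hypothetical hosts assumed to publish proof-of-concept code.
# The paper's actual identification criteria are not given in the abstract.
POC_HOSTS = {"www.exploit-db.com", "exploit-db.com", "packetstormsecurity.com"}


def has_poc_reference(reference_urls):
    """Return True if any reference URL points at an assumed PoC host."""
    return any(urlparse(url).hostname in POC_HOSTS for url in reference_urls)


def build_features(record, use_references):
    """Build a sparse feature dict, optionally including reference hosts."""
    features = {f"word={w}": 1 for w in record["description"].lower().split()}
    if use_references:
        for url in record["references"]:
            features[f"ref_host={urlparse(url).hostname}"] = 1
    return features


def accuracy_by_group(records, predictions):
    """Accuracy for test records with (True) and without (False) PoC references."""
    hits = {True: 0, False: 0}
    totals = {True: 0, False: 0}
    for record, predicted in zip(records, predictions):
        group = has_poc_reference(record["references"])
        totals[group] += 1
        hits[group] += int(predicted == record["exploited"])
    # Skip groups with no records to avoid dividing by zero.
    return {g: hits[g] / totals[g] for g in totals if totals[g]}
```

If a model trained with `use_references=True` scores noticeably higher on the group whose references contain proof-of-concept code, that gap is the kind of signal the abstract treats as evidence that references carry answer information.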

Keywords