IEEE Access (Jan 2024)

LinFuzz: Program-Sensitive Seed Scheduling Greybox Fuzzing Based on LinUCB Algorithm

  • Yinghao Su
  • Dapeng Xiong
  • Ying Wan
  • Chenghao Shi
  • Qingyao Zeng

DOI
https://doi.org/10.1109/ACCESS.2024.3404918
Journal volume & issue
Vol. 12
pp. 74843 – 74860

Abstract

Mutation-based greybox fuzzing is a widely used dynamic vulnerability detection technique that generates test cases by mutating input seeds. During fuzzing, the seed scheduling strategy and the energy scheduling strategy both affect test results and efficiency. Existing seed scheduling strategies, however, consider only a few specific seed attributes and ignore contextual information from seed execution, which makes it difficult to prioritize suitable seeds based on historical fuzzing results. Meanwhile, current coverage calculation methods do not evaluate software paths, so time is easily wasted on high-frequency, low-risk paths. This article proposes a new greybox fuzzing scheme, LinFuzz, which models seed scheduling as a contextual multi-armed bandit problem. It uses the LinUCB algorithm to assess the value of each seed for scheduling based on its historical execution information. LinFuzz also improves the calculation of path rewards and the seed energy scheduling algorithm: it allocates more energy to low-frequency paths in the program under test, improving exploration efficiency and path coverage. LinFuzz was evaluated on 12 real programs against other open-source tools, including AFL, AFLFast, FairFuzz, and Neuzz. The results show that, under the same testing time budget, LinFuzz outperforms the other tools in both the number of vulnerabilities discovered and code coverage. Compared with complex fuzzing optimization algorithms, LinFuzz also has lower memory consumption and time complexity.
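
To make the scheduling idea concrete, the following is a minimal sketch of the standard disjoint LinUCB rule applied to seed selection. It is not the LinFuzz implementation. The abstract does not specify the feature vector, the reward definition, or the exploration weight, so the class `LinUCBArm`, the function `select_seed`, the two-element contexts, the reward value, and `alpha` are all illustrative assumptions. Each seed is treated as an arm, scored as the predicted reward plus a confidence width, and the inverse matrix is kept up to date with the Sherman-Morrison formula.

```python
import math


class LinUCBArm:
    """Per-seed ridge-regression state for disjoint LinUCB (illustrative only)."""

    def __init__(self, dim):
        # A starts as the identity matrix, so its inverse is also the identity.
        self.a_inv = [[1.0 if i == j else 0.0 for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim

    def _a_inv_times(self, vec):
        return [sum(row[j] * vec[j] for j in range(len(vec)))
                for row in self.a_inv]

    def score(self, x, alpha):
        """Upper confidence bound: theta . x + alpha * sqrt(x^T A^-1 x)."""
        theta = self._a_inv_times(self.b)
        a_inv_x = self._a_inv_times(x)
        mean = sum(t * xi for t, xi in zip(theta, x))
        width = math.sqrt(sum(xi * v for xi, v in zip(x, a_inv_x)))
        return mean + alpha * width

    def update(self, x, reward):
        """Apply A += x x^T (via Sherman-Morrison) and b += reward * x."""
        a_inv_x = self._a_inv_times(x)
        denom = 1.0 + sum(xi * v for xi, v in zip(x, a_inv_x))
        n = len(x)
        for i in range(n):
            for j in range(n):
                self.a_inv[i][j] -= a_inv_x[i] * a_inv_x[j] / denom
        self.b = [bi + reward * xi for bi, xi in zip(self.b, x)]


def select_seed(arms, contexts, alpha=1.0):
    """Return the id of the seed with the highest upper confidence bound."""
    return max(arms, key=lambda sid: arms[sid].score(contexts[sid], alpha))


# Hypothetical contexts: [bias term, some normalized execution statistic].
arms = {"seed_a": LinUCBArm(2), "seed_b": LinUCBArm(2)}
contexts = {"seed_a": [1.0, 0.2], "seed_b": [1.0, 0.8]}

chosen = select_seed(arms, contexts)
# Hypothetical reward, e.g. 1.0 if fuzzing the chosen seed reached a new path.
arms[chosen].update(contexts[chosen], reward=1.0)
```

In a fuzzing loop, `select_seed` would be called before each scheduling round and `update` after the mutated inputs have run, so seeds whose contexts have historically led to rewards are favored while rarely tried seeds still receive an exploration bonus through the confidence width.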

Keywords