Applied Sciences (Jan 2018)

Pareto Optimal Solutions for Network Defense Strategy Selection Simulator in Multi-Objective Reinforcement Learning

  • Yang Sun,
  • Yun Li,
  • Wei Xiong,
  • Zhonghua Yao,
  • Krishna Moniz,
  • Ahmed Zahir

DOI
https://doi.org/10.3390/app8010136
Journal volume & issue
Vol. 8, no. 1
p. 136

Abstract

Read online

Using Pareto optimization in Multi-Objective Reinforcement Learning (MORL) leads to better learning results for network defense games. This is particularly useful for network security agents, who must often balance several goals when choosing what action to take in defense of a network. If the defender knows his preferred reward distribution, the advantages of Pareto optimization can be retained by using a scalarization algorithm prior to the implementation of the MORL. In this paper, we simulate a network defense scenario by creating a multi-objective zero-sum game and using Pareto optimization and MORL to determine optimal solutions and compare those solutions to different scalarization approaches. We build a Pareto Defense Strategy Selection Simulator (PDSSS) system for assisting network administrators on decision-making, specifically, on defense strategy selection, and the experiment results show that the Satisficing Trade-Off Method (STOM) scalarization approach performs better than linear scalarization or GUESS method. The results of this paper can aid network security agents attempting to find an optimal defense policy for network security games.

Keywords