Asynchronous Deep Double Dueling Q-learning for trading-signal execution in limit order book markets

Peer Nagy; Jan-Peter Calliess; Stefan Zohren

doi:10.3389/frai.2023.1151003

Frontiers in Artificial Intelligence (Sep 2023)

Affiliations

Peer Nagy: Department of Engineering Science, Oxford-Man Institute of Quantitative Finance, University of Oxford, Oxford, United Kingdom
Jan-Peter Calliess: Department of Engineering Science, Oxford-Man Institute of Quantitative Finance, University of Oxford, Oxford, United Kingdom
Stefan Zohren: Department of Engineering Science, Oxford-Man Institute of Quantitative Finance, University of Oxford, Oxford, United Kingdom; Man Group, London, United Kingdom; Alan Turing Institute, London, United Kingdom

Journal volume & issue: Vol. 6

Abstract

We employ deep reinforcement learning (RL) to train an agent to successfully translate a high-frequency trading signal into a trading strategy that places individual limit orders. Based on the ABIDES limit order book simulator, we build a reinforcement learning OpenAI gym environment and utilize it to simulate a realistic trading environment for NASDAQ equities based on historic order book messages. To train a trading agent that learns to maximize its trading return in this environment, we use Deep Dueling Double Q-learning with the APEX (asynchronous prioritized experience replay) architecture. The agent observes the current limit order book state, its recent history, and a short-term directional forecast. To investigate the performance of RL for adaptive trading independently from a concrete forecasting algorithm, we study the performance of our approach utilizing synthetic alpha signals obtained by perturbing forward-looking returns with varying levels of noise. Here, we find that the RL agent learns an effective trading strategy for inventory management and order placing that outperforms a heuristic benchmark trading strategy having access to the same signal.
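As a rough illustration of the synthetic alpha signals mentioned in the abstract, the sketch below perturbs forward-looking returns with noise of adjustable strength. It is not the authors' implementation: the function names `forward_returns` and `synthetic_alpha`, the use of mid-prices, the fixed step horizon, and the additive Gaussian noise model are all assumptions made for the example.

```python
import numpy as np


def forward_returns(mid_prices, horizon):
    """Simple return over the next `horizon` steps; NaN where the future is unknown."""
    mid = np.asarray(mid_prices, dtype=float)
    if not 1 <= horizon < mid.size:
        raise ValueError("horizon must satisfy 1 <= horizon < len(mid_prices)")
    out = np.full(mid.shape, np.nan)
    out[:-horizon] = mid[horizon:] / mid[:-horizon] - 1.0
    return out


def synthetic_alpha(mid_prices, horizon, noise_scale, rng):
    """Forward return plus Gaussian noise (assumed noise model, hypothetical helper).

    noise_scale = 0 gives a perfect look-ahead signal; larger values
    make the signal progressively less informative.
    """
    fwd = forward_returns(mid_prices, horizon)
    sigma = noise_scale * np.nanstd(fwd)  # noise measured in units of return volatility
    return fwd + rng.normal(0.0, sigma, size=fwd.shape)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy random-walk mid-price path, purely for demonstration.
    prices = 100.0 + np.cumsum(rng.normal(0.0, 0.05, size=1000))
    for scale in (0.0, 1.0, 4.0):
        signal = synthetic_alpha(prices, horizon=10, noise_scale=scale, rng=rng)
        valid = ~np.isnan(signal)
        fwd = forward_returns(prices, 10)
        corr = np.corrcoef(signal[valid], fwd[valid])[0, 1]
        print(f"noise_scale={scale}: correlation with true forward return = {corr:.2f}")
```

Sweeping `noise_scale` in this way mirrors the idea of studying an execution agent separately from any particular forecasting model: the signal quality becomes a controlled parameter rather than a property of a specific predictor.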

Published in Frontiers in Artificial Intelligence

ISSN: 2624-8212 (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.frontiersin.org/journals/artificial-intelligence#
