Applied Sciences (Feb 2025)

Deep Reinforcement Learning for Dynamic Pricing and Ordering Policies in Perishable Inventory Management

  • Yusuke Nomura,
  • Ziang Liu,
  • Tatsushi Nishi

DOI
https://doi.org/10.3390/app15052421
Journal volume & issue
Vol. 15, no. 5
p. 2421

Abstract

Perishable goods have a limited shelf life, and inventory should be discarded once it exceeds that shelf life. Finding optimal inventory management policies is essential because inefficient policies can lead to increased waste and higher costs. Many previous studies assume that perishable inventory is consumed following the First In, First Out rule, but this assumption does not reflect customer purchasing behavior. In practice, customers’ preferences are influenced by the shelf life and price of products. This study optimizes inventory and pricing policies for a perishable inventory management problem with age-dependent probabilistic demand. Introducing dynamic pricing, however, significantly increases the complexity of the problem. To tackle this challenge, we propose eliminating irrational actions in dynamic programming without sacrificing optimality. To solve the problem more efficiently, we also implement a deep reinforcement learning algorithm, proximal policy optimization. The results show that dynamic programming with action reduction reduced computation time by an average of 63.1% compared to vanilla dynamic programming. In most cases, proximal policy optimization achieved an optimality gap of less than 10%. Sensitivity analysis of the demand model revealed a negative correlation between customer sensitivity to shelf lives or prices and total profits.
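
To make the contrast with First In, First Out concrete, the following is a minimal sketch of one inventory period in which demand is split across age classes according to shelf life and price. It is not the demand model of the paper: the softmax (logit-style) utility, the function names `choice_probabilities` and `simulate_period`, and the parameters `beta_life` and `beta_price` are assumptions introduced purely for illustration, and demand is treated as a deterministic expected quantity rather than sampled.

```python
import math


def choice_probabilities(prices, shelf_lives, beta_life, beta_price):
    """Share of demand directed at each age class (assumed softmax form).

    Utility rises with remaining shelf life and falls with price; the
    linear utility and both sensitivity parameters are hypothetical.
    """
    utilities = [beta_life * s - beta_price * p
                 for p, s in zip(prices, shelf_lives)]
    peak = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(u - peak) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def simulate_period(inventory, prices, demand, order_qty,
                    beta_life=0.5, beta_price=0.3):
    """Advance the inventory by one period.

    inventory[i] holds the units with i + 1 periods of shelf life left.
    Returns (new_inventory, revenue, wasted).
    """
    shelf_lives = list(range(1, len(inventory) + 1))
    probs = choice_probabilities(prices, shelf_lives, beta_life, beta_price)

    # Each age class serves its share of demand, capped by available stock.
    sales = [min(stock, demand * p) for stock, p in zip(inventory, probs)]
    revenue = sum(q * p for q, p in zip(sales, prices))
    leftover = [stock - q for stock, q in zip(inventory, sales)]

    # Unsold units in their last period are discarded, the remaining
    # classes age by one period, and the new order arrives fresh.
    wasted = leftover[0]
    new_inventory = leftover[1:] + [order_qty]
    return new_inventory, revenue, wasted
```

Under these assumptions, raising `beta_life` shifts purchases toward fresher units and leaves more of the oldest stock to expire, which is the behavior a First In, First Out assumption cannot capture; lowering the price of older units in `prices` pulls demand back toward them.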

Keywords