Strategic bidding in freight transport using deep reinforcement learning

W.J.A. van Heeswijk*

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review


Abstract

This paper presents a multi-agent reinforcement learning algorithm to represent strategic bidding behavior by carriers and shippers in freight transport markets. We investigate whether feasible market equilibria arise without central control or communication between agents. Observed behavior in such environments serves as a stepping stone towards self-organizing logistics systems like the Physical Internet, while also offering valuable insights for the design of contemporary transport brokerage platforms. We model an agent-based environment in which a shipper and a carrier actively learn bidding strategies using policy gradient methods, posting bid and ask prices at the individual container level. Both agents aim to learn the best response given the expected behavior of the opposing agent. Inspired by financial markets, a neutral broker allocates jobs based on bid-ask spreads. Our game-theoretical analysis and numerical experiments focus on behavioral insights. To evaluate system performance, we measure adherence to Nash equilibria, fairness of reward division and utilization of transport capacity. We observe good performance both in predictable, deterministic settings (∼95% adherence to Nash equilibria) and in highly stochastic environments (∼85% adherence). Risk-seeking behavior may increase an agent’s reward share, yet overly aggressive strategies destabilize the system. The results suggest a potential for full automation and decentralization of freight transport markets. These insights ease the design of real-world market platforms, suggesting an innate tendency of markets to reach equilibria without behavioral models, information sharing or explicit incentives.
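
As a rough illustration of the broker mechanism mentioned in the abstract, the sketch below shows one way a neutral broker could allocate container jobs from bid-ask spreads. The abstract does not state the allocation rule, so everything here is an assumption: a job is treated as tradable only when the shipper's bid is at least the carrier's ask, tradable jobs are ranked by spread (bid minus ask), and a fixed transport capacity caps how many are accepted. The names `Offer`, `allocate` and `capacity` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """One container job with both prices (hypothetical structure)."""
    container_id: int
    bid: float  # price the shipper is willing to pay
    ask: float  # price the carrier demands


def allocate(offers, capacity):
    """Return (container_id, spread) pairs for the accepted jobs.

    Assumed rule, not taken from the paper: keep jobs with bid >= ask,
    rank them by spread in descending order, accept up to `capacity`.
    """
    feasible = [o for o in offers if o.bid >= o.ask]
    feasible.sort(key=lambda o: o.bid - o.ask, reverse=True)
    return [(o.container_id, o.bid - o.ask) for o in feasible[:capacity]]


offers = [
    Offer(container_id=1, bid=10.0, ask=8.0),
    Offer(container_id=2, bid=5.0, ask=7.0),   # bid below ask: not tradable
    Offer(container_id=3, bid=12.0, ask=6.0),
]
accepted = allocate(offers, capacity=2)
```

Under these assumptions, container 3 is accepted first because it has the largest spread, container 1 follows, and container 2 is rejected. In the paper's setting the bid and ask values would come from the learned policies rather than fixed numbers.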

Original language: English
Journal: Annals of Operations Research
Publication status: E-pub ahead of print/First online - 22 Feb 2022

Keywords

  • UT-Hybrid-D
  • Policy gradient
  • Self-organizing logistics
  • Strategic bidding
