TY - JOUR
T1 - Strategic bidding in freight transport using deep reinforcement learning
AU - van Heeswijk, W. J.A.
N1 - Publisher Copyright: © 2022, The Author(s).
PY - 2022/2/22
Y1 - 2022/2/22
AB - This paper presents a multi-agent reinforcement learning algorithm to represent strategic bidding behavior by carriers and shippers in freight transport markets. We investigate whether feasible market equilibria arise without central control or communication between agents. Observed behavior in such environments serves as a stepping stone towards self-organizing logistics systems like the Physical Internet, while also offering valuable insights for the design of contemporary transport brokerage platforms. We model an agent-based environment in which a shipper and a carrier actively learn bidding strategies using policy gradient methods, posing bid and ask prices at the individual container level. Both agents aim to learn the best response given the expected behavior of the opposing agent. Inspired by financial markets, a neutral broker allocates jobs based on bid-ask spreads. Our game-theoretical analysis and numerical experiments focus on behavioral insights. To evaluate system performance, we measure adherence to Nash equilibria, fairness of reward division and utilization of transport capacity. We observe good performance both in predictable, deterministic settings (∼95% adherence to Nash equilibria) and in highly stochastic environments (∼85% adherence). Risk-seeking behavior may increase an agent’s reward share, yet overly aggressive strategies destabilize the system. The results suggest a potential for full automation and decentralization of freight transport markets. These insights ease the design of real-world market platforms, suggesting an innate tendency of markets to reach equilibria without behavioral models, information sharing or explicit incentives.
KW - UT-Hybrid-D
KW - Policy gradient
KW - Self-organizing logistics
KW - Strategic bidding
UR - http://www.scopus.com/inward/record.url?scp=85124869526&partnerID=8YFLogxK
U2 - 10.1007/s10479-022-04572-z
DO - 10.1007/s10479-022-04572-z
M3 - Article
AN - SCOPUS:85124869526
SN - 0254-5330
JO - Annals of Operations Research
JF - Annals of Operations Research
ER -