Smart Containers with Bidding Capacity: A Policy Gradient Algorithm for Semi-Cooperative Learning

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review



Smart modular freight containers, as propagated in the Physical Internet paradigm, are equipped with sensors, data storage capability and intelligence that enable them to route themselves from origin to destination without manual intervention or central governance. In this self-organizing setting, containers may autonomously place bids on transport services in a spot market. However, individual containers may find it difficult to learn good bidding policies because their observations are limited. By sharing information and costs with one another, smart containers can jointly learn bidding policies, even while they compete for the same transport capacity. We replicate this behavior by learning stochastic bidding policies in a semi-cooperative multi-agent setting. To this end, we develop a reinforcement learning algorithm based on the policy gradient framework. Numerical experiments show that sharing solely bids and acceptance decisions leads to stable bidding policies. Real-time system information only marginally improves performance; individual job properties suffice to place appropriate bids. Furthermore, we find that carriers may have incentives not to share information with the smart containers. The experiments give rise to several directions for follow-up research, in particular the interaction between smart containers and transport services in self-organizing logistics.
Key in this approach is the interplay between the degree of autonomy of logistic systems and their degree of cooperativeness. On these two pillars, a unifying framework is presented, distinguishing four fundamental categories of self-organizing logistics. To illustrate how the framework works in practice, we present four real-life case studies, one for each category. The case studies are positioned as-is, and concrete directions for (more) self-organization are presented for each case. Moreover, possible additional dimensions of the framework, e.g., control hierarchy, system intelligence, connectivity, and predictability, are discussed.
The usefulness of the established framework is two-fold: (i) it provides a common ground for researchers to position their work and to identify potential future directions for research, and (ii) it serves as a practical and understandable starting point for practitioners investigating how self-organization may affect their business and where their limited resources should be focused.
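To make the idea of learning a stochastic bidding policy with a policy gradient more concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes a Gaussian policy with a single learnable mean bid shared by all containers, a REINFORCE-style update with a running-average baseline, and a toy carrier that accepts any bid at or above a fixed threshold. The function name, the threshold, the rejection penalty, and all hyperparameters are hypothetical; competition for limited capacity and job-specific features are left out.

```python
import random


def train_shared_bidding_policy(episodes=3000, n_containers=10, lr=0.01,
                                sigma=1.0, seed=0):
    """Illustrative REINFORCE loop for a shared Gaussian bidding policy."""
    rng = random.Random(seed)
    threshold = 2.0       # hypothetical: carrier accepts bids >= threshold
    penalty = 5.0         # hypothetical: cost incurred when a bid is rejected
    theta = 1.0           # policy parameter: mean of the Gaussian bid
    baseline = -penalty   # running average reward, used to reduce variance

    for _ in range(episodes):
        grad = 0.0
        total_reward = 0.0
        # Every container samples a bid from the shared policy; the bids and
        # acceptance decisions are pooled into one gradient estimate.
        for _ in range(n_containers):
            bid = rng.gauss(theta, sigma)
            accepted = bid >= threshold
            reward = -bid if accepted else -penalty
            # Score function of a Gaussian with respect to its mean.
            score = (bid - theta) / sigma ** 2
            grad += (reward - baseline) * score
            total_reward += reward
        theta += lr * grad / n_containers
        baseline += 0.05 * (total_reward / n_containers - baseline)
    return theta


learned_mean_bid = train_shared_bidding_policy()
print(f"learned mean bid: {learned_mean_bid:.2f}")
```

In this toy setting the learned mean bid should settle somewhat above the acceptance threshold: bidding higher costs more when accepted, while bidding lower raises the chance of paying the rejection penalty.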
Original language: English
Title of host publication: Computational Logistics
Subtitle of host publication: 11th International Conference, ICCL 2020, Enschede, The Netherlands, September 28–30, 2020, Proceedings
Editors: Eduardo Lalla-Ruiz, Martijn Mes, Stefan Voß
Number of pages: 16
ISBN (Electronic): 978-3-030-59747-4
ISBN (Print): 978-3-030-59746-7
Publication status: E-pub ahead of print/First online - 22 Sep 2020
Event: 11th International Conference on Computational Logistics, ICCL 2020 - Online conference, Enschede, Netherlands
Duration: 28 Sep 2020 → 30 Sep 2020
Conference number: 11

Publication series

Name: Lecture Notes in Computer Science


Conference: 11th International Conference on Computational Logistics, ICCL 2020
Abbreviated title: ICCL


  • Self-organizing logistics
  • Smart containers
  • Multi-agent reinforcement learning
  • Bidding
  • Policy gradient
  • 22/3 OA procedure


