Robust collaborative services interactions under system crashes and network failures

Lei Wang

Research output: Thesis / PhD Thesis - Research UT, graduation UT



Electronic collaboration has grown significantly in the last decade, with applications in areas such as shopping, trading, and logistics. Often, electronic collaboration is based on automated business processes that are managed by different companies and connected through the Internet. Such a business process is normally deployed on a process engine, a piece of software that executes the business process with the help of infrastructure services (operating system, database, network service, etc.). Because system crashes and network failures are possible, designing robust interactions for collaborative processes is a challenge. System crashes and network failures are common events that can occur in many kinds of information systems, such as servers, desktops, and mobile devices. Business processes use messages to synchronize their state. If a process changes its state, it sends a message to its peer processes in the collaboration to inform them about this change. System crashes and network failures may result in the loss of messages. In that case, the state change is performed by some but not all processes, resulting in global state/behavior inconsistencies and possibly deadlocks. In general, a state inconsistency is not automatically detected and recovered by the process engine. Recovery then often has to be performed manually after checking execution traces, which is potentially slow, error-prone, and expensive. Existing solutions either shift the burden to business process developers or require support from additional infrastructure services. For example, fault handling approaches require that developers are aware of possible failures and their recovery strategies. Transaction approaches require a coordinator and coordination protocols deployed in the infrastructure layer.
Our idea to solve this problem is to replace each original process by a robust counterpart, which is obtained from the original process through an automatic transformation before deployment on the process engine. The robust process is deployed with the same infrastructure services and automatically recovers from message loss and state inconsistencies caused by system crashes and network failures. In other words, the transformation is transparent to developers and leaves the infrastructure unmodified. We assume a synchronous interaction scenario for collaborative processes. In this scenario, an initiator sends a request message to a responder and waits for a response message, while the responder receives the request message, applies some state change, and sends the response message. With our proposed transformation we obtain robust processes in which each process in the responder role caches the response message if its state has been changed by the previously received request message. Possible state inconsistencies are recognized by using timers and information provided by the infrastructure, and are resolved by using the cached state and by retrying failed interactions. We also considered more complex interaction scenarios with multiple initiator and responder instances (1-n, n-1 and n-n client-server configurations). We have provided a formal proof of the correctness of our transformation solution. We have also done a performance analysis and determined the overhead of the generated (robust) processes compared to the original processes. Since this overhead is low compared to the performance differences that result from using different process engines, we argue that the generated robust processes are applicable in real-life business environments. Through this work, we have learnt which failure situations affect the global state/behavior of collaborative business processes. Furthermore, we have defined transformations for deriving robust processes that are capable of surviving the identified failures.
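The synchronous request/response scenario with response caching and retry can be illustrated with a minimal Python sketch. This is not the transformation defined in the thesis. The class names, the scripted `LossyChannel`, and in particular the use of a request identifier to recognise a retried request are assumptions made only for illustration, and a timer expiry is modelled simply as moving on to the next attempt.

```python
class LossyChannel:
    """Hypothetical network stand-in that loses a scripted set of messages."""

    def __init__(self, drops):
        # One boolean per transmission; True means "this message is lost".
        self._drops = list(drops)

    def deliver(self):
        # Once the script is exhausted, every message gets through.
        lost = self._drops.pop(0) if self._drops else False
        return not lost


class Responder:
    """Applies a state change once per request and caches the response."""

    def __init__(self):
        self.state = 0
        self._cache = {}  # request_id -> response already sent (assumed keying)

    def handle(self, request_id, amount):
        if request_id in self._cache:
            # Retried request: the state change was already applied,
            # so resend the cached response instead of applying it again.
            return self._cache[request_id]
        self.state += amount
        response = ("ok", self.state)
        self._cache[request_id] = response
        return response


class Initiator:
    """Sends a request and retries when no response arrives in time."""

    def __init__(self, responder, channel, max_attempts=5):
        self.responder = responder
        self.channel = channel
        self.max_attempts = max_attempts

    def call(self, request_id, amount):
        for attempt in range(1, self.max_attempts + 1):
            if not self.channel.deliver():
                continue  # request lost; timer expires, retry
            response = self.responder.handle(request_id, amount)
            if not self.channel.deliver():
                continue  # response lost; timer expires, retry
            return response, attempt
        raise TimeoutError("no response after %d attempts" % self.max_attempts)


if __name__ == "__main__":
    responder = Responder()
    # The first request arrives, but its response is lost on the way back.
    initiator = Initiator(responder, LossyChannel([False, True]))
    print(initiator.call("req-1", 5), responder.state)
```

In this sketch the lost response leaves the responder's state changed while the initiator is still waiting, which is the kind of inconsistency the abstract describes. The retry is answered from the cache, so the state change is applied only once.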
Original language: English
Qualification: Doctor of Philosophy
Awarding Institution
  • University of Twente
Supervisors/Advisors
  • Apers, Peter Maria Gerardus, Supervisor
  • Wieringa, Roelf Johannes, Advisor
Thesis sponsors
Award date: 23 Apr 2015
Place of Publication: Enschede
Print ISBNs: 978-90-365-3868-8
Publication status: Published - 23 Apr 2015


  • Network failure
  • System crashes
  • Business process collaboration
  • SCS-Services
  • Process Transformation
  • process recovery
  • EWI-25972

