Work-stealing is an efficient method to implement load balancing in fine-grained task parallelism. Typically, concurrent deques are used for this purpose. A disadvantage of many concurrent deques is that they require expensive memory fences for local deque operations.
In this paper, we propose a new non-blocking work-stealing deque based on the split task queue. Our design uses a dynamic split point between the shared and the private portions of the deque, and only requires memory fences when shrinking the shared portion.
We present Lace, an implementation of work-stealing based on this deque, with an interface similar to the work-stealing library Wool, and an evaluation of Lace based on several common benchmarks. We also implement a recent approach using private deques in Lace. We show that the split deque and the private deque in Lace have similar low overhead and high scalability as Wool.
|Lecture Notes in Computer Science
|Springer International Publishing
|7th International Euro-Par Workshop on Multi-/Many-core Computing Systems, MuCoCoS 2014
|26/08/14 → 26/08/14
|26 August 2014
- non-blocking deque
- lock-free algorithm
- Dynamic load-balancing
- task-based parallelism