OpenCL is an open standard for parallel programming of heterogeneous compute devices, such as GPUs, CPUs, DSPs or FPGAs. However, the verbosity of its C host API can hinder application development. In this paper we present cf4ocl, a software library for rapid development of OpenCL programs in pure C. It aims to reduce the verbosity of the OpenCL API, offering straightforward memory management, integrated profiling of events (e.g., kernel execution and data transfers), simple but extensible device selection mechanism and user-friendly error management. We compare two versions of a conceptual application example, one based on cf4ocl, the other developed directly with the OpenCL host API. Results show that the former is simpler to implement and offers more features, at the cost of an effectively negligible computational overhead. Additionally, the tools provided with cf4ocl allowed for a quick analysis on how to optimize the application.
|Publication status||Published - 2016|